Stephen Gatchell is currently Chief Data Officer, Engineering Analytics & Data Lake at EMC and serves on the EMC Data Governance Office, Master Data Management and Business Data Lake Operating committees, developing EMC’s corporate strategies for the Business Data Lake, Advanced Analytics and Information Asset Management. Stephen also serves as a Customer Insight Analyst for the Chief Technology Office, analyzing customer technology challenges and requirements.
- A data lake is a way to democratize data. It can hold structured, unstructured or semi-structured data and covers all of the data assets, including visualizations, tables and views.
- The idea of a data lake is to break down the silos between the departments and come up with innovative solutions across a variety of different sources. The use cases are very important.
- Historically, “big data” was 1 GB and all of it was managed by IT.
- The data would be imported into a BI system for analysis.
- There was no drill down into the raw data from a report.
- Today data analysts are at all levels of the organization in all departments.
- HR, legal, engineering, interns and VPs are all analysts.
- Now we need real-time data with access across the organization.
- The people, process and technology have all progressed over the last few decades.
- People: analysts want to drill down to get to the results.
- Process: analytics need to be updated in real time.
- Technology: open source allows anyone to build their own database.
- A data scientist is a subject matter expert (SME) who understands the data and can also code.
- They don’t need a mathematical background, but they are studying statistics and want to generate, for example, a D3 visualization.
- Rogue IT is a good thing (as long as it isn’t a security risk). We want the business to understand technology.
- IT needs to keep the lights on and support the business in finding new tech.
- IT needs to look for end-to-end solutions that help many groups across the org.
- For example: data lake, visualization and ingestion tools.
- The next things in Big Data are data governance, MDM and data quality.
- What is the value of this data to our business?
- Natural Language Query (NLQ) will enable a no-UI analytics engine.
- Flash storage will allow petabytes to be searched very quickly.
- Always learn, build connections and be as persistent as possible.