If you grow as an employee, the company grows as well. That’s why we decided to set up the Big Industries Academy. The Academy is a platform for continuous learning by sharing expertise and knowledge within the team. One of the Academy’s initiatives is the IOIntellia internal project, which team members can join on a voluntary basis.
IOIntellia is a fictitious social media analytics startup that wants to set up a data lake platform in order to gain a wider footprint in the social media analytics domain. IOIntellia wants to use effective, scalable and easily portable solutions for data collection, storage and analytics. Big Industries has been invited to design and build that platform.
The plan consists of the following steps:
• Set up an on-prem data lake
• Set up social media crawlers to collect data and feed it to the data lake:
    • Batch mode
    • Streaming mode
• Set up data flows to receive the data provided by the crawlers and store it in the data lake
• Design data marts specific to the needs of the following business areas:
    • Sentiment analytics on COVID-19
    • Social media usage patterns during lockdown across countries
• Set up an enterprise-grade security model
• Build visualization dashboards on top of the data marts
• Set up APIs on top of the data lake to provide “Data as a Service”
• Set up a data mart for travel patterns during lockdown across the world
• Link social media data with travel data and include it in the visualization dashboard
• Proposal to migrate the entire solution to the cloud
    • Value analysis for major vendors: Microsoft Azure, AWS and Google Cloud
    • Serverless architecture
    • Tool selection process for each component built in the on-premise solution
    • Migration
        • Data migration
        • Application logic migration
    • Document the migration strategy – to be used as “Big Industries on-prem to cloud migration methodology”
This internal learning initiative is run like a “real project” with weekly Scrum meetings. Epics and Stories are written and Issues are assigned and tracked in the project management tool Jira, while documentation such as release notes and status reports is stored and displayed in Confluence.
The overall project work is organised into Epics, which are broken down into smaller Stories/Issues that are distributed amongst the team members. Examples include:
· Define the Project Governance
· Proposal for Platform setup architecture
· Proposal for Data Pipeline
· Proposal for CI/CD
· Build a sentiment analytics engine to identify sentiment patterns across genders, age groups, geographies and time periods (time of day, day of the week, week and month)
· Build a Data Mart on top of the data lake for various known target use cases
· …
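To give a feel for what the sentiment analytics Epic is about, here is a minimal Python sketch of grouping sentiment scores by a chosen dimension. It is an illustration only, not the project's actual engine: the record fields (`country`, `age_group`, `posted_at`, `sentiment`), the sample values and the helper `average_sentiment` are all hypothetical, and it assumes each post has already been given a sentiment score between -1 and 1.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical, hand-made sample records (not real project data).
posts = [
    {"country": "BE", "age_group": "18-24",
     "posted_at": "2020-04-06T09:15:00", "sentiment": -0.4},
    {"country": "BE", "age_group": "25-34",
     "posted_at": "2020-04-06T21:30:00", "sentiment": 0.2},
    {"country": "NL", "age_group": "18-24",
     "posted_at": "2020-04-11T14:00:00", "sentiment": 0.6},
    {"country": "NL", "age_group": "25-34",
     "posted_at": "2020-04-12T08:45:00", "sentiment": -0.2},
]


def average_sentiment(records, key_fn):
    """Group records by key_fn(record) and average their sentiment scores."""
    buckets = defaultdict(list)
    for record in records:
        buckets[key_fn(record)].append(record["sentiment"])
    return {key: round(mean(scores), 2) for key, scores in buckets.items()}


def weekday_of(record):
    """Return the English weekday name of the record's timestamp."""
    posted = datetime.fromisoformat(record["posted_at"])
    return WEEKDAYS[posted.weekday()]


by_country = average_sentiment(posts, lambda r: r["country"])
by_age_group = average_sentiment(posts, lambda r: r["age_group"])
by_weekday = average_sentiment(posts, weekday_of)

if __name__ == "__main__":
    print(by_country)
    print(by_age_group)
    print(by_weekday)
```

The same helper works for any dimension (gender, hour of day, month) by swapping the key function, which is roughly the kind of slicing a data mart on top of the data lake would serve.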
At Big Industries we share the following values:
Share your Wisdom, Dare to Grow and Feed the Family Feel.
We truly believe that with the IOIntellia project we are actively putting these values into action.