Job Duties and Responsibilities: Lead a team of data engineers; Collect and analyze raw data from various sources, including social media platforms; Organize and maintain datasets; Improve data quality and process efficiency; Design and manage data ETL pipelines covering the journey of data from source to destination systems, processing 10 million+ data points daily, using Kafka for real-time data streaming and MongoDB for NoSQL storage across Kubernetes clusters; Design and deploy scalable microservices in Python and Golang, leveraging FlaskAPI, GraphQL, and Docker, ensuring sub-second response times and efficient concurrency with goroutines; Migrate large amounts of data from legacy databases to MongoDB to achieve sub-second access latencies, and optimize storage for unstructured data through Elasticsearch integration; Set up and manage the infrastructure required for ingestion, processing, and storage of data; Evaluate model needs and objectives; Interpret trends and patterns in data; Conduct complex data analysis and report on results; Prepare data for analysis and reporting by transforming and cleansing it; Combine raw information from different sources; Explore ways to enhance data quality and reliability; Identify opportunities for data acquisition; Develop analytical tools and programs; Collaborate with teams at COSMOS on several projects; Manage services and operational infrastructure for system reliability and resiliency; Create continuous integration/continuous deployment (CI/CD) pipelines with Jenkins and GitLab CI to automate service/system deployment; Integrate Prometheus for monitoring, Grafana for real-time dashboarding/visualization, and Kibana for analysis of logs sourced from Elasticsearch; Perform front-end development (HTML/CSS, JavaScript, Node.js, etc.); Train machine learning (ML) models on datasets; Deploy ML models; Enhance the system's fault tolerance by incorporating alert mechanisms; Develop with frameworks such as Spring Boot and React; Perform other tasks as assigned.
Knowledge, Skills, and Abilities: Expert proficiency in working with data models, data pipelines, ETL processes, data stores, data mining, and segmentation techniques; Expert proficiency in programming/scripting languages (e.g., Java and Python); Expert proficiency in working with data integration platforms and SQL database design; Expert-level numerical, analytical, and data security skills; Expert proficiency in collecting raw data from various social media platforms; Expert proficiency in creating CI/CD pipelines; Expert proficiency in front-end development (HTML/CSS, JavaScript, Node.js, etc.); Expert proficiency in training and deploying machine learning (ML) models on datasets; Ability to lead a large team of data engineers (5+ members); Expert proficiency with Kafka and MongoDB for NoSQL storage across Kubernetes clusters; Expert proficiency with microservices, Python, Golang, FlaskAPI, GraphQL, and Docker; Expert proficiency with Elasticsearch, Grafana, Prometheus, and Kibana; Expert proficiency in data modeling concepts (ERD, Dimensional Modeling, Data Vault) and data APIs (RESTful API); Expert proficiency in data processing software (e.g., Hadoop, Spark, TensorFlow, Pig, Hive, Flume) and algorithms (e.g., MapReduce); Expert proficiency in cloud platforms (AWS, Azure, GCP) and data warehousing solutions (Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse); Expert proficiency in technical communications.