JEET ANALYTICS PTE. LTD.
Posted: March 06, 2026
Quantexa Data Engineer
We are seeking a talented and experienced Data Engineer with expertise in Hadoop, Scala, Spark, Elasticsearch, OpenShift Container Platform (OCP), and DevOps practices to join our team. As a Data Engineer, you will play a crucial role in designing, developing, and optimizing big data solutions using Apache Spark, Scala, and Elasticsearch. You will collaborate with cross-functional teams to build scalable and efficient data processing pipelines and search applications. Knowledge and experience in the Compliance / AML domain will be a plus. Working experience with the Quantexa tool is a must.
Responsibilities:
• Implement data transformation, aggregation, and enrichment processes to support various data analytics and machine learning initiatives
• Collaborate with cross-functional teams to understand data requirements and translate them into effective data engineering solutions
• Design, develop, and implement Spark Scala applications and data processing pipelines to process large volumes of structured and unstructured data
• Integrate Elasticsearch with Spark to enable efficient indexing, querying, and retrieval of data
• Optimize and tune Spark jobs for performance and scalability, ensuring efficient data processing and indexing in Elasticsearch
• Implement data transformations, aggregations, and computations using Spark RDDs, DataFrames, and Datasets, and integrate them with Elasticsearch
• Develop and maintain scalable and fault-tolerant Spark applications, adhering to industry best practices and coding standards
• Troubleshoot and resolve issues related to data processing, performance, and data quality in the Spark-Elasticsearch integration
• Monitor and analyze job performance metrics, identify bottlenecks, and propose optimizations in both Spark and Elasticsearch components
• Ensure data quality and integrity throughout the data processing lifecycle
• Design and deploy data engineering solutions on OpenShift Container Platform (OCP) using containerization and orchestration techniques
• Optimize data engineering workflows for containerized deployment and efficient resource utilization
• Collaborate with DevOps teams to streamline deployment processes, implement CI/CD pipelines, and ensure platform stability
• Implement data governance practices, data lineage, and metadata management to ensure data accuracy, traceability, and compliance
• Monitor and optimize data pipeline performance, troubleshoot issues, and implement necessary enhancements
• Implement monitoring and logging mechanisms to ensure the health, availability, and performance of the data infrastructure
• Document data engineering processes, workflows, and infrastructure configurations for knowledge sharing and reference
Requirements:
• Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field
• Must be a Quantexa-certified data engineer / data architect and proficient with the tool
• Proven experience as a Data Engineer, working with Hadoop, Spark, and data processing technologies in large-scale environments
• Proficiency in Scala programming language and familiarity with functional programming concepts
• Hands-on working experience with the Quantexa tool is required
• In-depth understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
• Strong expertise in designing and developing data infrastructure using Hadoop, Spark, and related tools (HDFS, Hive, Pig, etc.)
• Experience with containerization platforms such as OpenShift Container Platform (OCP) and container orchestration using Kubernetes
• Proficiency in programming languages commonly used in data engineering, such as Python, Scala, or Java
• Knowledge of DevOps practices, CI/CD pipelines, and infrastructure automation tools (e.g., Docker, Jenkins, Ansible, Bitbucket)
• Experience with Grafana, Prometheus, and Splunk will be an added benefit
• Experience integrating and working with Elasticsearch for data indexing and search applications
• Solid understanding of Elasticsearch data modeling, indexing strategies, and query optimization
• Experience with distributed computing, parallel processing, and working with large datasets
• Proficient in performance tuning and optimization techniques for Spark applications and Elasticsearch queries
• Strong problem-solving and analytical skills with the ability to debug and resolve complex issues
• Familiarity with version control systems (e.g., Git) and collaborative development workflows
• Excellent communication and teamwork skills with the ability to work effectively in cross-functional teams
• Experience with cloud platforms (e.g., AWS, Azure, GCP) and their data services is a plus