Senior Java Developer + Data Engineer

Remote Full-time Live

MUST HAVE:

  • Must have a minimum of 10-12 years of hands-on development experience implementing batch and event-driven applications using Java, Kafka, Spark, Scala, PySpark and Python.
  • Experience with Apache Kafka and Connectors, Java and Spring Boot for building event-driven services, and Python for building ML pipelines.
  • Develop data pipelines responsible for ingesting large amounts of different kinds of data from various sources.
  • Help evolve the data architecture and work on next-generation real-time pipeline algorithms and architecture, in addition to supporting and maintaining current pipelines and legacy systems.
  • Write code and develop worker nodes for business logic, ETL and orchestration processes.
  • Develop algorithms for better attribution rules and category classifiers.
  • Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive search, discovery, and recommendations.
  • Work closely with architects, engineers, data analysts, data scientists, contractors/consultants and project managers to assess project requirements and to design, develop and support data ingestion and API services.
  • Work with Data Scientists in building feature engineering pipelines and integrating machine learning models during the content enrichment process.
  • Able to influence priorities when working with various partners, including engineers, the project management office and leadership.
  • Mentor junior team members, define architecture, review code, do hands-on development and deliver work in sprint cycles.
  • Participate in design discussions with Architects and other team members for the design of new systems and re-engineering of components of existing systems.
  • Wear an Architect hat when required, bringing new ideas, thought leadership and forward thinking to the table.
  • Take a holistic approach to building solutions by thinking of the big picture and overall solution.
  • Work on moving away from legacy systems to a next-generation architecture.
  • Take complete ownership, from requirements through solution design, development, production launch and post-launch production support. Participate in code reviews and regular on-call rotations.
  • Desire to apply best-in-industry solutions and correct design patterns during development, and to learn best practices and data engineering tools and technologies.
  • Performs any other functions and duties assigned and necessary for the smooth and efficient operation

EDUCATION & EXPERIENCE:

  • BS or MS in Computer Science (or related field) with 12+ years of hands-on software development experience working in large-scale data processing pipelines.
  • Must-have skills are Apache Spark, Scala and PySpark, with 2-4 years of experience building production-grade batch pipelines that handle large volumes of data.
  • Must have at least 8 years of experience in Java and API / Microservices.
  • Must have at least 5 years of experience in Python.
  • 5+ years of experience in understanding and writing complex SQL and stored procedures for processing raw data, ETL, data validation, using databases such as SQL Server, Redis and other NoSQL DBs.
  • Knowledge of Big data technologies, Hadoop, HDFS.
  • Expertise in building event-driven pipelines with Kafka and Java / Spark.
  • Expertise with the AWS stack, such as EMR, EC2 and S3.
  • Experience working with APIs to collect and ingest data, as well as building APIs for business logic.
  • Experience setting up, maintaining, and debugging production systems and infrastructure.
  • Experience in building fault-tolerant and resilient systems.
  • Experience in building worker nodes, knowledge of REST principles and data engineering design patterns.
  • In-depth knowledge of Java, Spring Boot, Spark, Scala, PySpark, Python, orchestration tools, ESB, SQL, stored procedures, Docker, RESTful web services, Kubernetes, CI/CD, observability techniques, Kafka, release processes, caching strategies, versioning, B&D, BitBucket / Git, the AWS Cloud ecosystem, NoSQL databases and Hazelcast.
  • Strong software development, architecture diagramming, problem-solving and debugging skills.
  • Phenomenal communication and influencing skills.

NICE TO HAVE:

  • Exposure to Machine Learning (ML) and LLMs, using AI during coding, and building with AI.
  • Knowledge of Elastic APM, ELK stack and search technologies such as Elasticsearch / Solr.
  • Some experience with workflow orchestration tools such as Apache Airflow or Apache NiFi.