About the company
Appen is a leader in AI enablement for critical tasks such as model improvement, supervision, and evaluation. To do this, we leverage our global crowd of over one million skilled contractors, speaking over 180 languages and dialects and representing 130 countries. In addition, we utilize the industry's most advanced AI-assisted data annotation platform to collect and label various types of data, including images, text, speech, audio, and video. Our data is crucial for building and continuously improving the world's most innovative artificial intelligence systems, and Appen is already trusted by the world's largest technology companies. Now, with the explosion of interest in generative AI, Appen is giving leaders in automotive, financial services, retail, healthcare, and government the confidence to deploy world-class AI products.

At Appen, we are purpose-driven. Our fundamental role in AI is to ensure all models are helpful, honest, and harmless, so we firmly believe in unlocking the power of AI to build a better world. We have a learn-it-all culture that values perspective, growth, and innovation. We are customer-obsessed, action-oriented, and celebrate winning together.

At Appen, we are committed to creating an inclusive and diverse workplace. We are an equal opportunity employer that does not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Job Summary
Key Responsibilities:
• Design, build, and manage large-scale data infrastructure using a variety of AWS technologies such as Amazon Redshift, AWS Glue, Amazon Athena, AWS Data Pipeline, Amazon Kinesis, Amazon EMR, and Amazon RDS.
• Design, develop, and maintain scalable data pipelines and architectures on Databricks using tools such as Delta Lake, Unity Catalog, and Apache Spark (Python or Scala), or similar technologies.
• Integrate Databricks with cloud platforms like AWS to ensure smooth and secure data flow across systems.
• Build and automate CI/CD pipelines for deploying, testing, and monitoring Databricks workflows and data jobs.
• Continuously optimize data workflows for performance, reliability, and security, applying Databricks best practices around data governance and quality.
Qualifications:
• 5-7 years of hands-on experience with AWS data engineering technologies, such as Amazon Redshift, AWS Glue, AWS Data Pipeline, Amazon Kinesis, Amazon RDS, and Apache Airflow.
• Hands-on experience with Databricks, including Delta Lake, Apache Spark (Python or Scala), and Unity Catalog.
• Demonstrated proficiency in SQL and NoSQL databases, ETL tools, and data pipeline workflows.
• Experience with Python and/or Java.
• Deep understanding of data structures, data modeling, and software architecture.
• Experience with AI and machine learning technologies is highly desirable.
• Strong problem-solving skills and attention to detail.