Staff Software Engineer - Distributed Data Systems

Databricks, San Francisco, California
On-site, Full-time, $192K/yr - $260K/yr

Experience Level

Mid to Senior

Qualifications

We are looking for software engineers who possess:

  • Strong programming skills in languages such as Scala, Java, or Python.
  • Experience with distributed systems and cloud computing.
  • Familiarity with big data technologies and architectures.
  • Problem-solving skills and the ability to work collaboratively in a dynamic environment.
  • Knowledge of machine learning concepts (a plus).

About the job

P-186

At Databricks, we are passionate about empowering data teams to tackle some of the world’s most challenging problems, from security threat detection to cancer drug development. Our mission is to build and operate the leading data and AI infrastructure platform, enabling our customers to concentrate on the high-value challenges that are integral to their own objectives.

Founded in 2013 by the original creators of Apache Spark™, Databricks has rapidly evolved from a small office in Berkeley, California, to a global powerhouse with over 1000 employees. Trusted by thousands of organizations, from startups to Fortune 100 companies, we are recognized as one of the fastest-growing SaaS companies worldwide.

Our engineering teams create highly sophisticated products that address significant needs in the industry. We continuously push the limits of data and AI technology while maintaining the resilience, security, and scalability essential for our customers' success on our platform.

We manage one of the largest-scale software platforms, consisting of millions of virtual machines that generate terabytes of logs and process exabytes of data daily. At this scale, we frequently encounter cloud hardware, network, and operating system faults, and our software must effectively shield our customers from these challenges.

Modern data analysis leverages advanced techniques, such as machine learning, that far exceed the capabilities of traditional SQL query engines. As a Software Engineer on the Runtime team at Databricks, you will be instrumental in developing the next generation of distributed data storage and processing systems that outshine specialized SQL query engines in relational query performance, while providing the flexibility and programming abstractions to support a variety of workloads, from ETL to data science.

Examples of projects you may work on include:

  • Apache Spark™: Contributing to the de facto open-source framework for big data.
  • Data Plane Storage: Developing reliable, high-performance services and client libraries for storing and accessing vast amounts of data on cloud storage backends like AWS S3 and Azure Blob Storage.
  • Delta Lake: A storage management system that merges the scalability and cost-effectiveness of data lakes with the performance and reliability of data warehouses, featuring low latency streaming. Its higher-level abstractions and guarantees, including ACID transactions and time travel, significantly reduce the complexity of real-world data engineering architectures.
  • Delta Pipelines: Aiming to simplify the management of data engineering pipelines.

About Databricks

Databricks is a leading data and AI platform company founded by the original creators of Apache Spark™. With a commitment to innovation and excellence, we help organizations solve complex data challenges.
