Nvidia Corporation

Senior DevOps Engineer (Finance)



NVIDIA is seeking a passionate, motivated, and technical Architect/Engineer to join its dynamic and fast-paced Infrastructure, Planning and Processes organization, where you will work as a Principal DevOps & SRE Engineer supporting the design and implementation of AI tool solutions on Kubernetes for the company's Cloud Platform. The position is part of a fast-paced crew that develops and maintains sophisticated build & test environments for a multitude of hardware platforms, including both NVIDIA GPUs and Tegra Processors, along with various operating systems (Windows/Linux/Android). The team works with various other business units within NVIDIA Software, such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence, Robotics, and Autonomous cars, to cater to their infrastructure and systems needs.

What you'll be doing:

  • Craft the overall architecture for integrating coding assistance & Trustworthy AI tools into the existing infrastructure, ensuring alignment with reliable, scalable, and secure standard methodologies. Design for scalability, ensuring the implementation can support current and future workloads without degrading system performance
  • Identify and automate repetitive or toilsome production tasks related to code deployment, validation, and review, leveraging coding assistance tools to improve operational efficiency
  • Implement robust monitoring and observability for coding assistance/Trustworthy AI tools & application services, ensuring their availability and performance within the production environment
  • Integrate security best practices throughout the development lifecycle, ensuring coding assistance tools do not introduce vulnerabilities or compliance risks
  • Collaborate closely with software engineers, product teams, and security teams to align the coding assistance/Trustworthy AI tools' capabilities with organizational goals and developer needs. Establish feedback mechanisms to gather insights from developers and product/engineering teams on the effectiveness of coding assistance/Trustworthy AI tools, iterating on integrations and configurations for continuous improvement
  • Maintain comprehensive documentation for architecture decisions, integration processes, operational runbooks, and troubleshooting guides

What we need to see:

  • Kubernetes domain expertise with extensive experience building scalable, resilient platforms in both public and private clouds, and the ability to provide platform engineering/architecture standard methodologies (including experience architecting and implementing the overall platform, administration & configuration, orchestration, security, and monitoring ecosystem). Experience maintaining cloud infrastructure (on-prem & CSP) and highly available production environments.
  • Strong programming background in Python and/or similar scripting languages. Excellent problem-solving, communication, and teamwork skills.
  • Strong understanding of architectural requirements and development processes involved in building reliable, robust, scalable data products and pipelines.
  • Demonstrated ability to automate processes using Continuous Integration/Continuous Delivery (CI/CD) tools. Proficient in using Configuration as Code and infrastructure-as-code tools such as Ansible, Puppet, Chef & Terraform. Strong background with GitLab, GitHub, Perforce, Jenkins and/or other CI/CD systems & Artifactory.
  • Experienced with data analytics/visualization & monitoring tools like Kibana, Grafana, Splunk, Zabbix, Prometheus and/or similar systems. Experience with databases, both SQL (MySQL) and NoSQL (Elasticsearch/MongoDB/Cassandra).
  • 10+ years of proven experience with a Bachelor's or Master's degree in Computer Science, Software Engineering, or equivalent experience

Ways to stand out from the crowd:

  • Solid understanding of containerization and microservices architecture. Certified Kubernetes Administrator (CKA), Certified Kubernetes Security Specialist (CKS) & Certified Kubernetes Application Developer (CKAD) preferred.
  • Prior experience with the implementation and management of Trustworthy AI tools (QuantPi, Credo AI, Armilla AI), coding assistance AI tools (Cursor, Sourcegraph Cody) & code review AI tools (CodeRabbit)
  • Thrives in a multi-tasking environment with constantly evolving priorities.
  • Ability to break complex problems down into simple subproblems and then reuse available solutions to implement most of those. Ability to design simple systems that can work efficiently without needing much support.
  • Prior experience with a large-scale operations team. Experience with using and improving data centers. Background in computer algorithms and the ability to choose the best possible algorithms to meet the scaling challenge.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.

The base salary range is 168,000 USD - 333,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

© 2025 American Indian Jobs