DGX Cloud Infrastructure Engineering Intern - Fall 2025 (NVIDIA)
Job posting number: #227547 (Ref:JR1994669)
Job Description
NVIDIA is hiring engineers to scale up its AI infrastructure. We expect you to have a strong programming background, a deep understanding of distributed systems, familiarity with software testing and deployment, and excellent communication and planning abilities. We also welcome out-of-the-box thinkers who can contribute new ideas with a strong bias toward execution. Expect to be constantly challenged, improving, and evolving for the better. You and the other engineers on this team will help advance NVIDIA's capacity to build and deploy leading infrastructure solutions for a broad range of AI-based applications that affect core data science. If you're creative, passionate about what you do, and love having fun, what are you waiting for? Apply today!
What you’ll be doing:
- Design and architect a comprehensive platform that automates GPU asset provisioning, configuration, and lifecycle management across cloud providers.
- Design, develop, test, debug, and optimize creative solutions for data center firmware throughout its lifecycle.
- Work closely with hardware, software, infrastructure, and business teams to transform new firmware features from idea to reality.
- Define server-level reliability, availability, and serviceability requirements in collaboration with customers such as cloud service providers (CSPs), and deliver fault-resilient solutions at scale that meet customer expectations.
- Collaborate with hardware, software, and firmware teams to drive failure analysis and large-scale solution deployment.
- Work with engineering teams across NVIDIA to ensure your software integrates seamlessly from the hardware all the way up to the AI training applications.
What we need to see:
- Currently pursuing a Bachelor's, Master's, or PhD degree in Computer Engineering, Electrical Engineering, Computer Science, or a related field.
- Coursework or internship experience in the following areas is required: Computer Architecture, Deep Learning or Machine Learning, GPU Computing and Parallel Programming, and performance modeling, profiling, optimization, and/or analysis.
- Prior experience with or knowledge of the following programming languages and technologies is required: C, C++, Python, Perl, GPU Computing (CUDA, OpenCL, OpenACC), Deep Learning Frameworks (PyTorch, TensorFlow, Caffe), HPC (MPI, OpenMP).
You will also be eligible for Intern benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.