Intellectica is recruiting an experienced and highly skilled Site Reliability Engineer on behalf of a newly established AI & Machine Learning startup, which will be integrated into a leading group of companies operating in the aerospace, defense, and high-technology sector. This startup aims to develop cutting-edge AI-driven solutions tailored to the aerospace and defense industry. You will join a dynamic cross-functional team and report directly to the Principal SRE.
Key activities and responsibilities of this role include:
- Implement, operate, and maintain infrastructure services across cloud-based and on-prem environments, under the guidance of Senior and Principal SREs
- Provision and manage containerized applications using Docker and Kubernetes, ensuring scalable, reliable, and smooth application deployments
- Automate infrastructure tasks using Terraform (HCL) and other modern DevOps tools to support repeatable and efficient operations
- Support CI/CD pipelines, enabling fast, reliable, and secure software releases through continuous integration and deployment best practices
- Implement observability tools for metrics, logging, and alerting to ensure high availability and system reliability
- Assist in the setup and operation of secure networking, including firewalls, access controls, and VPNs, following established security guidelines
- Collaborate closely with ML Engineers and Data Scientists to understand infrastructure needs and deliver scalable solutions
- Participate in code reviews, pair programming, and knowledge-sharing sessions to elevate team practices
- Contribute to operational documentation, playbooks, and internal knowledge bases
- Assist in incident response, root-cause analysis, and resolution documentation
Professional experience & qualifications of a successful candidate:
- Bachelor’s degree in Computer Science, Informatics, or a related quantitative field
- At least 2-3 years of relevant experience in DevOps, Site Reliability Engineering (SRE), or similar roles (mid-level)
- Strong hands-on experience with containerization and orchestration tools, including Docker, Kubernetes, Helm, and Docker Compose
- Proven expertise in infrastructure automation using Terraform (HCL) and related DevOps tooling
- Solid understanding of Linux system administration, including performance tuning and troubleshooting
- Familiarity with cloud platforms such as AWS or GCP. Experience with Microsoft Azure is a strong plus
- Experience with observability tools, and a strong understanding of monitoring, logging, and alerting best practices
- Exposure to distributed computing frameworks (e.g., Ray) is a plus
- Good understanding of IT security best practices and software development quality assurance processes
- Familiarity with technologies such as Python, React, SQL databases, GitHub, and web servers is desirable
- Fluent in both Greek and English
Core competencies of a successful candidate:
- Analytical mindset with structured problem-solving skills
- High sense of ownership and commitment to system reliability
- Collaborative and proactive team player
- Clear communicator, able to simplify complex technical topics
- Eager to learn and grow within a high-performance engineering culture
