Site Reliability Engineer - Intellectica Solutions Driven

Intellectica is recruiting an experienced and highly skilled Site Reliability Engineer on behalf of a newly established AI & Machine Learning startup, which will be integrated into a leading group of companies operating in the aerospace, defense, and high-technology sector. This startup aims to develop cutting-edge AI-driven solutions tailored to the aerospace and defense industry. You will join a dynamic cross-functional team and report directly to the Principal SRE.

Key activities and responsibilities of this role include:

Implement, operate, and maintain infrastructure services across cloud-based and on-prem environments, under the guidance of Senior and Principal SREs
Provision and manage containerized applications using Docker and Kubernetes, ensuring scalable, reliable, and smooth application deployments
Automate infrastructure tasks using Terraform (HCL) and other modern DevOps tools to support repeatable and efficient operations
Support CI/CD pipelines, enabling fast, reliable, and secure software releases through continuous integration and deployment best practices
Implement observability tools for metrics, logging, and alerting to ensure high availability and system reliability
Assist in the setup and operation of secure networking, including firewalls, access controls, and VPNs, following established security guidelines
Collaborate closely with ML Engineers and Data Scientists to understand infrastructure needs and deliver scalable solutions
Participate in code reviews, pair programming, and knowledge-sharing sessions to elevate team practices
Contribute to operational documentation, playbooks, and internal knowledge bases
Assist in incident response, root-cause analysis, and resolution documentation

Professional experience & qualifications of a successful candidate:

Bachelor’s degree in Computer Science, Informatics, or a related quantitative field
At least 2-3 years of relevant experience in DevOps, Site Reliability Engineering (SRE), or similar roles (mid-level candidates)
Strong hands-on experience with containerization and orchestration tools, including Docker, Kubernetes, Helm, and Docker Compose
Proven expertise in infrastructure automation using Terraform (HCL) and related DevOps tooling
Solid understanding of Linux system administration, including performance tuning and troubleshooting
Familiarity with cloud platforms such as AWS or GCP; experience with Microsoft Azure is a strong plus
Experience with observability tools, and a strong understanding of monitoring, logging, and alerting best practices
Exposure to distributed computing frameworks (e.g., Ray) is a plus
Good understanding of IT security best practices and software development quality assurance processes
Familiarity with technologies such as Python, React, SQL databases, GitHub, and web servers is desirable
Fluent in both Greek and English

Core competencies of a successful candidate:

Analytical mindset with structured problem-solving skills
High sense of ownership and commitment to system reliability
Collaborative and proactive team player
Clear communicator, able to simplify complex technical topics
Eager to learn and grow within a high-performance engineering culture

Site Reliability Engineer (int-0415)