Software Development Life Cycle (SDLC)

This document describes the processes that support the following company policies:

Software Development Life Cycle (SDLC) Policy: Annotated Notes Drata Policy

TODO: review and incorporate the relevant part of the Processes page.

General Principles

The planning, development, testing, deployment, and maintenance cycles are performed iteratively.

The first step in the SDLC process is planning. During this stage, the product and development teams work together to create a high-level plan for the project, identify key stakeholders, and establish clear project objectives.

The outcome of this step depends on the scope of the changes:

If the project is a major change to the system, the product team defines the product goals used to meet the high-level company objectives and the development team writes a design document that describes the technological choices that are considered to implement the changes. A number of milestones, epics, stories, and tasks are created in Shortcut to track the progress of the project. Additional process documentation from the product team can be found on the Project Execution Notion page.
The design document should address the System Requirements Security Checklist.
If the project is a small change that results from a production incident or support request, a set of tasks is created to track the progress of the mini project. To address the issue in a timely manner, no additional documentation is produced. These tasks are created in the Techops project using the Techops template. See the Techops Issue Tracking Notion page for more information about this process.

The development cycle is divided into regular sprints/checkpoints (currently 2 weeks long). At the end of each sprint / beginning of the next sprint, developers post asynchronous high-level updates (currently to the #daily-gsd Slack channel) describing the milestones they are working on, what has been completed since the last update, what is in progress, and the roadblocks and issues that have been encountered. This is an opportunity to surface any problems or improvement suggestions, prioritize the next tasks/projects in the product backlog, see who will be free soon, identify where we need to intervene to cut a project's scope, and assign the next tasks to people who are ready for them. Refer to the Sprint Planning Process page for more information on the related processes.

Development

Developers post daily updates to the #daily-gsd Slack channel to inform on the tasks currently being worked on and any blockers preventing completion of these tasks.

Git is used to manage the codebase and track changes throughout the development process. To make it easy to link tasks to the actual work performed, developers use Shortcut's built-in Git support to link tasks to Git branches and pull requests.

Commit messages are written using the Conventional Commits convention.
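
As an illustration of the convention, the sketch below checks whether the first line of a commit message follows the Conventional Commits header format (type, optional scope, optional "!" marker, colon, space, description). It is not a tool we currently run: the function name and the list of accepted types are assumptions made for this example.

```python
import re

# Assumption: accepted types are "feat" and "fix" from the Conventional Commits
# spec plus the commonly used Angular-style types. Adjust to the team's choice.
ALLOWED_TYPES = ("build", "chore", "ci", "docs", "feat", "fix",
                 "perf", "refactor", "revert", "style", "test")

HEADER_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)"               # commit type, e.g. "feat"
    r"(?:\((?P<scope>[^()\s]+)\))?"    # optional scope in parentheses
    r"(?P<breaking>!)?"                # optional breaking-change marker
    r": (?P<description>\S.*)$"        # colon, one space, non-empty description
)


def is_conventional_commit(message: str) -> bool:
    """Return True if the first line of `message` is a valid header."""
    lines = message.splitlines()
    if not lines:
        return False
    match = HEADER_PATTERN.match(lines[0])
    return match is not None and match.group("type") in ALLOWED_TYPES
```

For example, "feat(api): add pagination to the search endpoint" passes this check, while "Fixed stuff" does not.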

GitHub Actions is used to run tests and perform continuous integration.

Each change is done using a branch that is merged (no fast forward) to the main branch, in order to preserve the history. While developing, the local branch should be rebased onto the latest main to avoid drifting for too long and causing merge hell.
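
The branch workflow described above can be sketched as the Git command sequences below. This is a minimal sketch, not our actual tooling: the helper names, the "origin" remote name, and the example branch name are assumptions, and in practice the merge may be done through a pull request rather than locally.

```python
import shlex
from typing import List


def rebase_commands(main: str = "main", remote: str = "origin") -> List[List[str]]:
    """Commands that replay the current local branch on top of the latest main."""
    return [
        ["git", "fetch", remote],
        ["git", "rebase", f"{remote}/{main}"],
    ]


def merge_commands(branch: str, main: str = "main",
                   remote: str = "origin") -> List[List[str]]:
    """Commands that merge `branch` into main with an explicit merge commit."""
    return [
        ["git", "checkout", main],
        ["git", "pull", "--ff-only", remote, main],
        # --no-ff always creates a merge commit, which preserves the branch history.
        ["git", "merge", "--no-ff", branch],
    ]


def as_shell(commands: List[List[str]]) -> List[str]:
    """Render the command lists as shell-ready strings."""
    return [shlex.join(command) for command in commands]


if __name__ == "__main__":
    # "feature/example" is a hypothetical branch name.
    for line in as_shell(rebase_commands() + merge_commands("feature/example")):
        print(line)
```

Each command list can also be passed to subprocess.run(command, check=True) to execute it from a script.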

We do not currently protect the main branches because of technical constraints (the release pipeline / sbt-release-plugin should be updated to support this), but we should generally act as if the main branches were protected to make the eventual transition smoother.

Testing

The nature and amount of testing is context-specific and includes, where relevant:

Automated unit tests
Automated integration tests
Automated end-to-end tests
Manual tests
Monitors and alerts

The development environment is used to perform manual testing prior to deployment in production.

An automated test suite runs on the CI server (GitHub Actions). These tests may be executed before or after the code is merged to the main branch, depending on the circumstances. For productivity reasons, we do not commit to running all tests before merging, but a developer who merges a build that eventually fails is responsible for fixing the build.

In general, testing should not be done with production data. When there is a good reason to do so, the Information Resource Owner needs to provide approval. Approval is requested by creating a ticket in the Information Owner Resource Requests label using the Information Owner Resource Requests - Testing With Production Data story template.

Deployment

The deployment procedure varies slightly from project to project, but is generally performed using Terraform and Jenkinsfiles. Deployment information (versions, config) is tracked by committing the Terraform/Jenkinsfiles to Git along with the code.

All deployment and infrastructure changes can be made through a specific role that developers can assume. The deployments may be performed directly by the CI server, but may also be performed by developers themselves (by assuming the centralized role). Regardless of how these deployments are performed, any changes to the infrastructure / any assumption of the role are tracked using CloudTrail.
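
To illustrate what this tracking makes possible, the sketch below filters already-exported CloudTrail event records for assumptions of a given role. It relies only on standard CloudTrail record fields (eventSource, eventName, eventTime, userIdentity, requestParameters); the function name, the output keys, and any role ARN passed to it are assumptions made for this example, and fetching the records from AWS is out of scope.

```python
from typing import Any, Dict, Iterable, List


def role_assumptions(records: Iterable[Dict[str, Any]],
                     role_arn: str) -> List[Dict[str, str]]:
    """Return who assumed `role_arn`, and when, from CloudTrail event records."""
    found = []
    for record in records:
        # AssumeRole calls are logged by the STS service.
        if record.get("eventSource") != "sts.amazonaws.com":
            continue
        if record.get("eventName") != "AssumeRole":
            continue
        params = record.get("requestParameters") or {}
        if params.get("roleArn") != role_arn:
            continue
        identity = record.get("userIdentity") or {}
        found.append({
            "time": record.get("eventTime", ""),
            "caller": identity.get("arn", ""),
            "session": params.get("roleSessionName", ""),
        })
    return found
```

Given a list of records, role_assumptions(records, role_arn) returns one entry per matching AssumeRole event, which can be compared against the deployments that were expected in the same period.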

Deployable software components should be listed in the service inventory and should come with operational documentation (operations.md) describing:

The deployment procedure
How to operate the service/application
The monitoring and alerts that the component provides
Links to the relevant logs
Important troubleshooting / failure handling procedures

They also come with high-level architecture documents (architecture.md) describing:

Background information
High-Level Logic and Data Flow Diagrams
Links to the Design Documentation
Components
Architectural Deficiencies
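
The two documentation checklists above could be verified automatically. The sketch below is one possible approach, not an existing check: it assumes the files use Markdown "#" headings, and the keywords used to recognize each topic are hypothetical and would need to match the headings actually used in our repositories.

```python
import re
from typing import Dict, List

# Assumption: each expected topic is recognized by a keyword in a heading.
REQUIRED_TOPICS: Dict[str, List[str]] = {
    "operations.md": ["deployment", "operat", "monitoring", "logs",
                      "troubleshooting"],
    "architecture.md": ["background", "data flow", "design", "components",
                        "deficiencies"],
}

# A Markdown heading: one or more "#", spaces or tabs, then the title.
HEADING_PATTERN = re.compile(r"^#+[ \t]+(.*)$", flags=re.MULTILINE)


def missing_topics(filename: str, text: str) -> List[str]:
    """Return the expected topics that no heading in `text` mentions."""
    headings = [m.group(1).lower() for m in HEADING_PATTERN.finditer(text)]
    return [
        topic for topic in REQUIRED_TOPICS[filename]
        if not any(topic in heading for heading in headings)
    ]
```

An empty result means that every expected topic has at least one matching heading; it says nothing about the quality of the content under it.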

For major releases, the code review associated with the deployment should address the System Deployment checklists.

Code and Infrastructure Review

Code and infrastructure reviews are created manually by developers (each code review may link to one or several branches / pull requests) and may be performed before or after the code is merged and/or deployed. The code reviews are tracked to completion, but no software guarantees that all pull requests are reviewed; doing so is a matter of practice / habit.
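
As a sketch of what an automated check could look like, the example below lists merged pull requests that are not covered by any completed code review. The CodeReview type, its fields, and the pull request numbers are hypothetical; how the data would be collected from our tools is left open.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class CodeReview:
    """Hypothetical record of a manually created code review."""
    title: str
    pull_requests: Set[int]   # pull request numbers the review links to
    completed: bool = False


def unreviewed_pull_requests(merged_prs: Iterable[int],
                             reviews: Iterable[CodeReview]) -> List[int]:
    """Return merged pull requests not linked to any completed review."""
    covered: Set[int] = set()
    for review in reviews:
        if review.completed:
            covered |= review.pull_requests
    return sorted(set(merged_prs) - covered)


if __name__ == "__main__":
    reviews = [
        CodeReview("Billing refactor", {101, 102}, completed=True),
        CodeReview("New importer", {103}, completed=False),
    ]
    print(unreviewed_pull_requests([101, 102, 103, 104], reviews))
```

In this example, pull request 103 is reported because its review is still open, and 104 because no review links to it.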

Monitoring and Maintenance

The final step in the SDLC process is maintenance. During this stage, the team should provide ongoing support for the software, including bug fixes, updates, and enhancements.

We have set up a VictorOps rotation. The on-call developer is responsible for monitoring the following places to address any pressing issues:

VictorOps alerts
The #auto-techops Slack channel
The #auto-app-notifications Slack channel
The #helpdesk Slack channel

More information on our monitoring infrastructure and conventions:

narrative-network-infra/monitoring, and in particular
- notification-publisher-infra
- victorops-integration

Process Improvements Under Consideration

Ideas currently being discussed to improve our processes:

6-week appetites: Shape Up defines the rationale for such 6-week windows.
Security: Marko and Seth are currently working on an internal vulnerability disclosure policy and a way of tracking security issues internally.
Project Execution
- Standardization of project delivery, covering everything from “we have an idea” to delivery.
- Explicit definition of what we consider best practices when delivering a project.
- Change the bot logic to track code reviews to completion.
