An Introduction To CI

Introduction

Most of the projects I’ve been working on these days make use of the CI/CD tools provided by GitLab. In the past I worked indirectly with CircleCI, but building pipelines myself is something I am only starting to practice now. This post should be helpful if you’re getting to grips with CI tools for the first time too.

What is CI?

CI stands for Continuous Integration. It is a development practice in which team members merge their work into a shared code base multiple times per day.

Why would you want to do this?

Imagine you are working in a team with a number of other developers, and you are all building new features for your product. Each of you picks a different feature to work on and spends the rest of the week completing the task you assigned yourself.

A few days pass, perhaps even a week, and you complete your new feature (congratulations). Now you return to the rest of your team and try to merge your work with the rest of the code base…

It’s a disaster.

While you were away, the rest of the team has been adding features to the project, completing bug fixes and refactoring code. So when you try to merge after such a long time, you are confronted by a mountain of conflicts.

Some of the problems you find include:

  • Lines have been moved.
  • Other lines have been added or deleted.
  • Code has been written that changes how yours should work or interface with other parts of the project.
  • Dependencies have changed or been updated, but you are somehow version-locked because of the tools you have used.

So you get back to work and grind through the huge number of merge conflicts. What happens while you do this? The code base changes again as your team keeps working. Rather than continuous integration, your team is experiencing continuous merge conflict.

On top of that, because the code you are having trouble merging with is now quite old, it’s harder for your teammates to help you: the purpose and structure of whatever they wrote is less fresh in their minds. The longer it has been since your last merge, the less helpful your colleagues are likely to be.

Eventually, after a huge amount of development time has been spent solving all the merge conflicts, your feature gets introduced. However, even though you merged successfully, it turns out that the code you wrote doesn’t fit properly with the code base in its current form, and your feature has introduced a number of bugs into the program.

Perhaps waiting so long to merge wasn’t such a good idea…

Things Could Have Been Different

Smaller Merges, Fewer Conflicts

The first advantage of merging your code with the code base more frequently is that you are likely to have far fewer conflicts with the master branch. Less time has elapsed since you last merged with it, and the amount of code you are trying to introduce is smaller.

This means that you will be able to fix all the conflicts in less time, decreasing the chance that you will have to face up to the challenge of continuous merge conflict we described earlier.

Additionally, the conflicting code will have aged less, so you can easily go to colleagues and ask them about changes that have been made since you last merged (or at least you will be better placed to understand what changes have taken place, and what they mean for you).

The Need For Automated Testing and Checks

So this methodology has the potential to spare us a lot of problems, but merging frequently creates a new one: if you are merging often, you need to make sure that what you are merging is OK.

Checking every merge by hand would be annoying, wasteful and prone to error if your team had to do it constantly. Instead, CI relies on automated verification to make sure that, to an acceptable extent, the code you pushed is safe to merge with the code base.

There are two main ways this is done:

  1. Run a check to see if your overall project still builds when your change is implemented and merged. If it doesn’t, clearly you shouldn’t merge it until you resolve some issues. Software that assists you with CI will run this check for you and let you know if you’ve made a mess of things.
  2. Run your test suite. By seeing if the unit tests your team composed all pass, you can check, with some confidence, that your work will be safe to merge with the master branch.

On top of these two checks, you can also get your CI suite to check things like styling and the code conventions your team has agreed on. You can even get the pipeline to perform odd jobs, such as resizing image files.
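The checks described above can be sketched as a small script that runs a series of commands in order and stops at the first failure. This is only an illustration of the idea, not how GitLab or any other CI tool is implemented: the `run_checks` function and the `EXAMPLE_CHECKS` list are hypothetical names, and the commands are stand-ins that would be replaced by your project's real build, test and style tools.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

# A check is a label plus the command (and arguments) that performs it.
Check = Tuple[str, Sequence[str]]


def run_checks(checks: Sequence[Check]) -> Tuple[bool, List[str]]:
    """Run each check in order and stop at the first one that fails.

    Returns (all_passed, labels_of_checks_that_passed).
    """
    passed: List[str] = []
    for label, command in checks:
        # A non-zero exit code is the conventional signal that a step failed.
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"FAILED: {label}", file=sys.stderr)
            return False, passed
        passed.append(label)
    return True, passed


# Stand-in commands so the sketch runs anywhere (assumption: a real pipeline
# would call the project's own build, test and lint tools here instead).
EXAMPLE_CHECKS: List[Check] = [
    ("build", [sys.executable, "-c", "print('building...')"]),
    ("unit tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
    ("style", [sys.executable, "-c", "print('style ok')"]),
]
```

In a pipeline, a script like this would exit with a non-zero status whenever `run_checks` reports a failure, since the exit status is typically what tells a CI tool that a job has failed and the merge should wait.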

It’s important to realise that having automated testing doesn’t absolve you from having to test yourself. There are things that computers are really not well suited to judging: they might be able to check the CSS of elements on the DOM, but how do those elements actually look?

Automated verification isn’t a replacement for human testing, but it can certainly speed things up. Where it really shines is in checking that your code is able to merge properly with the master branch, and in catching small or systematic errors a human may have missed, since computers are much more diligent than humans.

Why Wouldn’t You Practice CI?

Why doesn’t everyone use CI as a methodology though? If it solves such huge problems, surely we should all be using it?

The first reason that teams choose not to adopt CI is that, although it can make your team more productive, in the short term it is not free. It costs effort and, for some projects, money too (for infrastructure or cloud services).

For CI to work, the development team often has to completely change its culture as well as its workflow. The team has to automate the bulk of its testing, and either install new infrastructure or develop a dependency on cloud services.

The technical parts of embracing CI are much easier than making this organisational and cultural shift. The members of your team are going to have to change or develop some habits for CI to work. For instance, they will need to remember not only to check their code against the code base, but also to pull from the master branch several times a day.
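One way to support the habit of pulling regularly is to measure how far a branch has drifted from master. The sketch below is a minimal illustration under several assumptions: the shared branch is called `origin/master`, the script is run inside a git repository after a `git fetch`, and the threshold of 10 commits is an arbitrary, hypothetical team rule. The function names are invented for this example.

```python
import subprocess


def parse_behind_count(rev_list_output: str) -> int:
    """Turn the output of `git rev-list --count` into an integer."""
    return int(rev_list_output.strip())


def commits_behind(target: str = "origin/master") -> int:
    """Count commits on `target` that the current branch does not have yet.

    Must be run inside a git repository; run `git fetch` first so that
    `target` reflects the latest state of the remote.
    """
    output = subprocess.run(
        ["git", "rev-list", "--count", f"HEAD..{target}"],
        capture_output=True,
        text=True,
        check=True,  # raise if git itself reports an error
    ).stdout
    return parse_behind_count(output)


def needs_pull(behind: int, threshold: int = 10) -> bool:
    """Hypothetical team rule: pull once you are `threshold` commits behind."""
    return behind >= threshold
```

Calling `needs_pull(commits_behind())` from a local hook or a scheduled reminder would then flag branches that have fallen behind, though the right threshold depends entirely on how quickly your team's code base moves.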

Most drawbacks of adopting CI come from the fact that developers are human, and it’s difficult to practice this methodology perfectly. It’s one thing to use software to alert you to the fact that a build has failed. It’s another to get someone to stop whatever they were doing and divert their attention to fixing it immediately.

The ideal of CI is for broken builds to be dealt with on the spot; in reality, however, there is often some delay. For reasons like these, while CI is increasingly being adopted by development teams everywhere, it does have its detractors, and it might not be the right fit for every single situation.

However, in my opinion, you’re likely to be better off overall if you bring at least some parts of the process into your workflow (such as having some automated testing, and having information about whether or not a merged branch can build).

Conclusion

In this article, we’ve gone through:

  • What CI is
  • The problems CI addresses
  • How CI avoids extreme problems with merge conflicts and always gives you a build you can test and work with
  • How, even though CI seems to have a lot in common with the number 42, it’s not a perfect solution, or the answer to life, the universe and everything, in part because it can be difficult to implement

I hope you found this article useful. Good luck if you’re learning more about Ops and CI/CD tools.