Random Test Failures in Jenkins

I work on Python/Django websites. I use Git as my VCS and Jenkins CI for continuous integration. I have set up two Python virtual environments, one for development and the other for pre-production.

My issue:
I have many unit, regression, and smoke tests written for the website. Both my development and pre-production virtualenvs are connected to Jenkins CI.

Recently, tests have been failing randomly in both environments in Jenkins CI whenever code changes are pushed to them. Sometimes tests fail randomly even when no code changes have been pushed.

Troubleshooting done:

  • Ran the tests locally; they pass.
  • Ran some builds manually in Jenkins CI (using the Build Now button); the tests pass.
  • Ran the failing tests individually; they still pass.

Tests that failed in earlier builds passed in the next builds, and some tests that passed in earlier builds failed in the next ones. Can someone suggest what I can do?

One solution collected from the web for “Random Test Failures in Jenkins”:

    You are going to have to identify an environmental factor that causes the tests to fail randomly.

    Some things I have seen cause this:

    • Memory – There are other things running on the CI machine, and it
      doesn’t have enough memory to run all of them and build your code.
    • Time – There is something in your code that fails depending on the
      time. For example, I had code that would fail on Feb 29th. It
      surprised us after succeeding many times. It could be something like a
      failure to format the number of seconds if there was only one digit.
    • External dependencies – Your tests depend on some other server to be
      up. If it goes down or gets really busy, it won’t respond to your
      test code and the test fails. This could be a database server.
    • Database content – You might not have set all the preconditions
      correctly for a test that runs against the database.
    • Concurrency – Sometimes multi-threaded code will only fail when
      conditions are just right (or just wrong). A little random delay
      introduced by outside factors could make the code work or make it
      fail. It’s easy to overlook race conditions in multi-threaded code.
    • Servers (or CPUs) – Sometimes a test will fail
      if it runs on a particular server or core among the test machines.
      Of course if you only have one test machine, this can’t happen. But
      if one machine has something broken, poor connectivity (firewall
      rules), other processes running, less (or more) memory, your tests
      could fail when they are randomly assigned to run on that one.
    • [Insert yours here] – And there are a million more.
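
To make the "Time" item concrete, here is a minimal sketch of a date-dependent bug of the Feb 29th kind. The function names (`same_day_next_year_naive`, `same_day_next_year`, `failing_dates`) are hypothetical and not taken from the original code; the point is only that a test calling the naive helper with today's date passes on almost every day and then fails on one.

```python
import datetime


def same_day_next_year_naive(today):
    # Hypothetical helper: raises ValueError when `today` is Feb 29,
    # because the following year has no Feb 29.
    return today.replace(year=today.year + 1)


def same_day_next_year(today):
    # Safer variant: fall back to Feb 28 when the target date does not exist.
    try:
        return today.replace(year=today.year + 1)
    except ValueError:
        return today.replace(year=today.year + 1, day=28)


def failing_dates(dates):
    """Return the dates on which the naive helper raises."""
    failing = []
    for d in dates:
        try:
            same_day_next_year_naive(d)
        except ValueError:
            failing.append(d)
    return failing
```

Passing the date in as a parameter, instead of calling `datetime.date.today()` inside the function, lets a test pin awkward dates such as `datetime.date(2024, 2, 29)` so the failure shows up on every run and not only once every four years.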

    These are hard problems to solve, especially if they go away for no apparent reason. That makes you nervous, because you suspect the problem will come back just when you are in a big hurry to fix a nasty bug in the production system.
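
One way to start narrowing things down is to rerun the suite many times in shuffled order and note which tests give mixed results. The sketch below is only an illustration under assumptions: `find_flaky` and the two demo tests are hypothetical names, and the tests are plain callables, not Django test cases or a Jenkins job.

```python
import random


def find_flaky(tests, runs=20, seed=0):
    """Run every test callable `runs` times in shuffled order.

    `tests` maps a name to a zero-argument callable that raises on failure.
    Returns the sorted names of tests that both passed and failed.
    """
    rng = random.Random(seed)  # fixed seed so the shuffle order is repeatable
    outcomes = {name: set() for name in tests}
    names = list(tests)
    for _ in range(runs):
        rng.shuffle(names)
        for name in names:
            try:
                tests[name]()
                outcomes[name].add("pass")
            except Exception:
                outcomes[name].add("fail")
    return sorted(name for name, seen in outcomes.items() if len(seen) > 1)


# Hypothetical demo tests: one depends on hidden shared state, one does not.
_calls = {"n": 0}


def test_sometimes():
    _calls["n"] += 1
    assert _calls["n"] % 3 != 0  # fails on every third call


def test_stable():
    assert 1 + 1 == 2


flaky = find_flaky({"test_sometimes": test_sometimes, "test_stable": test_stable}, runs=6)
```

A test that appears in the result depends on something other than the code under test, such as leftover state, ordering, or timing, and is a good candidate to check against the list of causes above.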
