Crash investigation in Firefox Nightly

One of the areas we have been focusing on lately is getting crashes on file for Nightly. The Platform team created Project Uptime to specifically focus on improving our crash situation on Desktop and Mobile. I was immediately interested in being part of Project Uptime since I have always enjoyed investigating crashes.

The Uptime effort helps the overall Nightly product development for several reasons:

Identifies issues early to make the product better
Allows us to quickly identify regression ranges
Helps catch crashes in Nightly before they might move to Aurora
Keeps Nightly stable so we can retain users

My Typical Workflow

Once a week I look at the crashes in Crash Stats. We have a set of queries we reference that help make it easier to drill down by Platform as well as by type of crash.

Where it starts

If I see a new signature that doesn’t have a bug associated with it, I begin looking at the crashes in that signature. On Uptime we usually look for signatures that crash on Nightly 10 or more times a day, across multiple installations. On platforms such as Mac and Linux, where we have less crash volume, I might file a bug for a signature with lower volume than I would require for a Windows crash.
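As a rough illustration of the volume rule of thumb above, here is a minimal Python sketch that counts crashes per signature per day and keeps the signatures that reach the threshold on at least one day. The record layout (dictionaries with `signature` and `date` keys) and the function names are assumptions made for this example; they are not the Crash Stats data format.

```python
from collections import Counter


def daily_volume(crashes):
    """Count crashes per (signature, day) pair."""
    return Counter((c["signature"], c["date"]) for c in crashes)


def signatures_over_threshold(crashes, min_per_day=10):
    """Return, sorted, the signatures that reach min_per_day on any single day.

    The default of 10 mirrors the rule of thumb in the text; pass a lower
    value for platforms with less crash volume, such as Mac or Linux.
    """
    counts = daily_volume(crashes)
    return sorted({sig for (sig, _day), n in counts.items() if n >= min_per_day})
```

Passing a smaller `min_per_day` is one way to model the lower bar I use for Mac and Linux. The check for multiple installations is a separate step.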

An Example

Bug 1284051 is a good example of a bug I caught recently that was a small volume regression. The link to all the crashes showed me in what build the crash started. In this case the crash started with Build ID 20160630030207 and then continued in the Nightly builds for several days.
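Build IDs are timestamps in the form YYYYMMDDHHMMSS, so the build in which a crash started can be found by sorting the build IDs seen in the reports. The sketch below assumes you have already collected the build IDs for one signature into a list; the helper names are made up for this example.

```python
from datetime import datetime


def parse_build_id(build_id):
    """Convert a build ID such as "20160630030207" into a datetime."""
    return datetime.strptime(build_id, "%Y%m%d%H%M%S")


def first_affected_build(build_ids):
    """Return the earliest build ID in the collection, or None if it is empty."""
    if not build_ids:
        return None
    return min(build_ids, key=parse_build_id)
```

For the bug above, `parse_build_id("20160630030207")` gives June 30, 2016 at 03:02:07, which is the start of the regression range to investigate.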

I start the process by looking at the set of crash reports and determining if they are from a set of users or from an individual user. In the screenshot below you can see that the install times are all different for this particular crash:

Note that there are sometimes crash spikes with a particular signature, but they all come from a single installation and turn out to be duplicates. In the typical Uptime workflow we ignore these, because it’s difficult to tell if it’s actually a real problem or a specific problem with the user’s machine or installation.
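The install-time check described above can be sketched in a few lines of Python. This assumes each crash report is a dictionary with an `install_time` key, used here as a stand-in for "one installation"; the field name and the 0.9 cutoff are hypothetical choices for illustration, not values from the Uptime project.

```python
from collections import Counter


def installation_spread(reports):
    """Return (number of distinct install times, share held by the largest one)."""
    by_install = Counter(r["install_time"] for r in reports)
    total = sum(by_install.values())
    if total == 0:
        return 0, 0.0
    largest = max(by_install.values())
    return len(by_install), largest / total


def looks_like_single_install_spike(reports, max_share=0.9):
    """True when one installation accounts for at least max_share of the reports."""
    distinct, share = installation_spread(reports)
    return distinct > 0 and share >= max_share
```

A signature whose reports all share one install time is flagged as a likely duplicate spike, while reports spread over many install times are not.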

I then look at an individual crash report and determine what stack we may have crashed in. The “Source” column in the attached screenshot may give me some clues about who last worked in that area of code.

In this particular example I found that “nsilva” had worked in that area by clicking on the second link, so I added a “needinfo” on him in the Bugzilla bug to see if he could help me figure out what might be causing the crash. One thing to remember is that although the Source column can be useful, you really need to expand “Show Other Threads” to see the full picture (and you may need the help of a developer to untangle what is going on – I often do). You can also reference the Module Owners list if you are not sure who oversees a particular area of code. Often you may not know where to bucket the crash – when this happens, don’t be afraid to ask for help (IRC is your best friend).

All of the information I find that is relevant goes into the bug report. Other items I may add include:

Module or addon correlations
Crash URLs, if there is a trend
Added comments, if there is a trend
Uptime range if it is a startup crash

Other helpful items to include in the bug report (from the Uptime page):

From about:support: The Nightly build id, which looks like “20160506052823”
Number of crashes that have occurred with this signature. I usually paste a link which includes all the crashes across branches, or on a particular branch if it is branch specific.
Crash rank (e.g. “this is the #1 top crash for Nightly 20160506052823”)
Number of installations
For new signatures, which Nightly build the crash started in. This page explains an easy way to search for this in Crash Stats.
When possible, a regression window for that Nightly. Adding “regression” and “regressionwindow-wanted” to the keyword field will also help trigger someone to help.
When possible, an indication of which bug’s patches may have caused the crash.
To get the attention of Release Drivers, nominate the bug by setting the appropriate tracking flag to ‘?’ in the bug report.
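To make it easier to include these items consistently, the summary can be assembled with a small helper. This is only a sketch: the function name, its parameters, and the wording of each line are my own choices for illustration, not an official Bugzilla or Uptime template.

```python
def format_crash_summary(signature, build_id, crash_count, installations,
                         rank=None, first_build_id=None):
    """Build a plain-text summary to paste into a crash bug comment."""
    lines = [
        f"Crash signature: {signature}",
        f"Nightly build ID: {build_id}",
        f"Crashes with this signature: {crash_count}",
        f"Installations affected: {installations}",
    ]
    # Optional items are only added when the information is available.
    if rank is not None:
        lines.append(f"Crash rank: #{rank} top crash for Nightly {build_id}")
    if first_build_id is not None:
        lines.append(f"First seen in Nightly build: {first_build_id}")
    return "\n".join(lines)
```

Calling it with, say, a build ID of "20160506052823" and `rank=1` produces a line like “Crash rank: #1 top crash for Nightly 20160506052823”. Any crash counts or installation numbers you pass in must come from the actual Crash Stats data.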

This bug had a happy ending as nsilva was able to address the crash with a null check.

Platform-specific crashes are another interesting area I like to explore. Since I run the Mac developer builds, I look at crashes specific to the next Mac OS version daily. It is possible in Crash Stats to construct a query that returns only the crashes present on the most recent Sierra beta. This is useful for seeing and addressing issues early, while the OS is still under development. (You can also help the effort by joining the Apple beta program and running Sierra with Firefox installed – more users help us identify issues more quickly 🙂 )

In summary, crash investigation is a fascinating part of browser development. It has many unique challenges, and often investigating a crash can be a bit like solving a puzzle. For example, crashes with different causes can get lumped under the same signature, making it difficult to separate out all the different issues. We have challenges with third party DLLs, plugins and addons, and malware – sometimes just finding the right contact within an organization can be tricky. There may be slightly different crash signatures for Windows, Mac, and Linux and we have to account for that when reviewing the data. At the end of the day though, for me there is great satisfaction in filing a crash and watching it progress toward a fix.

8 comments on “Crash investigation in Firefox Nightly”

Marc wrote on July 19, 2016 at 5:30 am:

I appreciate the developers who are working with and for Firefox that are making it possible for me to continue to utilize my communications with my older Mac G5!!!!

Thank you.

P.S. Any updates for the playing of youtube video on this browser as mine is always breaking up.

Again thanks.

1. alex_mayorga wrote on July 20, 2016 at 5:06 pm:
  
  ¡Hola Marc!
  
  Please check https://support.mozilla.org/en-US/products/firefox and ask a question there if need be. Our SuMo community would gladly help.
  
  ¡Gracias!
  
Vickie Peters wrote on July 21, 2016 at 1:35 pm:

Hi Marcia.

I would love to help Firefox. I gave a lecture about two weeks ago and I commended Firefox on its methodology to help provide a safe, stable and consistent web search engine/browser for the internet community. It is a great humanitarian initiative.
Vickie

agus wrote on July 22, 2016 at 3:37 am:

i appreciate the developers who are working with and for Firefox that are making it possible for me to continue to utilize my communications with my older Mac G5!!!!

Thank you.

PHAM THANH TUYEN wrote on July 22, 2016 at 4:21 am:

Dear Marcia,
I have built firefox-47.0.1 on a x86_64 Linux with the system curl is curl 7.31.0. Then I install curl-7.49.1. Firefox then crashed in the Web sites of .vn domain. I installed back curl-7.31.0, the situation is better, but sometimes still crashed. Cairo package have no problems? I did not use system cairo, but internal cairo of Firefox. Thanks!
Tuyen

Cal wrote on July 23, 2016 at 10:17 pm:

Although my system is old I expect it to continue to do things well that it has historically done well. FF is definitely falling on its face on 32 bit systems with javascript and video playback. If it was working before why doesn’t it work now?

These “outlier” cases matter because they show where the architecture team is making mistakes.

Keep some old computers around (like your core user in Africa, Asia and similar parts of the world) and actually use them.

Cheers, keep up the otherwise good work.

FireMouse wrote on July 24, 2016 at 9:46 am:

Hi there,

I strongly prefer the idea of using Firefox over other browsers for ethical reasons, however on Linux, for years now, across multiple distros and multiple hardware platforms I’ve had issues with Firefox’s memory ballooning out of control, even in safe mode, until it ends up slowing down the computer and finally crashing when Firefox gets to the point of eating between 1.5-2.5 gig of ram (I have 16 on my machine).

I’ve tried the usual solutions posted in the KB article on pinpointing memory issues, and using various extensions that aim to pinpoint where the memory usage is happening. The trouble is that things like about:memory-addons show Firefox only using a fraction of the memory that the system shows it using, while the reports from about:memory are largely indecipherable to me, even as a fairly technical user (I triage bugs and compile other software regularly).

If I run in safe mode, the memory issue balloons outwards more slowly, but it still happens.

Strangely, the only install where I *haven’t* seen this happen, is on a virtual machine running Debian 8 (what I’m using to post this comment, now). Unfortunately, I need to use other distros for other parts of my work, so this isn’t a real solution.

I’m not asking you to fix this problem here on your blog comments, but I am asking what you feel the best course of action would be to help devs pinpoint the issue, as it’s meant that despite strongly preferring Firefox for ideological reasons, I’ve had to start using Chromium instead, to restore my system stability. This makes me, a proud Firefox supporter, into a sad, sad bunny.

Any suggestions?

Wellington Torrejais da Silva wrote on July 28, 2016 at 10:04 pm:

Nice job…

Firefox Nightly News
