Archive for the ‘General’ Category

Launchpad Workshop at UDS-R

Tuesday, October 23rd, 2012

After the success of the Launchpad clinics at UDS-Q, we’ve decided to run some more! This time we’ve dropped the sterile name “clinic” and are calling them workshops.

If you want to get involved, scratch that itch, or learn how to fix that irksome bug that has been bugging you, you’re not alone. Everyone probably has at least one bug that they’d like to see fixed. The problem is not knowing how to fix it, or not knowing how to set up the Launchpad development environment. Lucky for you, we have a lot of Launchpad developers at UDS-R and we’d like to help you get bugs fixed!

The idea is that if you have a bug you would like to fix, or you would like to be pointed in the right direction, we’ll be there to help you get on the road and to offer advice on every step of the Launchpad development process, from lines of code to branch reviews to getting things done. We’ll have EC2 instances ready for you to develop on, so if you haven’t already gone through the process of setting up local Launchpad development on your machine, you don’t need to worry.

I have created a wiki page on which you should register if you’re going to attend either of the workshops. Just list your name and the ID of the bug(s) you want to work on. We’ll check the bugs out and get in touch with you if we think they’re too big to work on in the workshops, in which case we’ll try to work with you to get them fixed over a longer period. We’ve added the event to the summit schedule for Tuesday and Thursday of UDS, so why not sign up and come along!

If you’ve never contributed before, Graham Binns has written a useful guide to contributing to Launchpad. He has also recorded a screencast on fixing a bug in Launchpad.

Burning down critical bugs

Tuesday, October 2nd, 2012

I have been analysing Launchpad’s critical bugs to track the Purple squad’s progress while on Launchpad maintenance duty. In January of 2011, the Cloud Engineering team (né Launchpad Engineering team) was reorganised into squads, where one or more squads would maintain Launchpad while other squads worked on features. This change also aligned with a newfound effort to enforce the zero-oops policy. The two maintenance squads had more than 332 critical bugs to close before we could consider adding features that the stakeholders and community wanted. By July 2011, the count had dropped to its lowest point, 250 known critical bugs. Why did the count stop falling for fifteen months? Why is the count falling again?

Charting and analysing critical bugs

Chart of Launchpad's critical bugs since the formation of Launchpad squads and maintenance duties
The chart above needs some explanation to understand what is happening in Launchpad’s critical bugs over time. (You may want to open the image in a separate window to see everything in detail.) Each iteration is one week. The backlog represents the open critical bugs in Launchpad at the start of the iteration. The future bugs are either bugs that are not discovered, not introduced, or reported and fixed within the iteration. The last group is crucial to understanding the lines plotting the number of bugs fixed and added during the iteration. We strive to close critical bugs immediately. Most critical bugs are reported and fixed in a few days, so most bugs were not open long enough to show up in the backlog. The number of bugs fixed must exceed the number added to make the backlog count fall. You can see that the maintenance squads have always been burning down the critical bugs, but if you are just watching the number of open bugs in Launchpad, you get the sense that the squads are running just to stand still.

I use the lp-milestone YUI widget to chart the bugs and analyse our progress through the critical bugs. It allows me to summarise a set of bugs, or analyse a subset by bug tag.

Launchpad maintenance analysis -- driving critical bugs to zero

Though 22 bugs were fixed this past week, 14 were added, so the critical count dropped by only 8. The last eight iterations are used to calculate the average number of bugs closed and opened per iteration. The relative velocity (velocity – flux) is used to estimate the remaining number of days to drive the count to zero. When the Purple squad started maintenance on September 10th of 2012, the estimated days of effort was more than 1,200. In just three weeks, the number has fallen dramatically. The principal reason the backlog of critical bugs has fallen is that the Purple squad is now giving those bugs their full attention, but that generalisation is unsatisfactory.
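The estimate described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the lp-milestone widget's code: the function name, the seven-day iteration length, and the sample backlog of 240 are my assumptions. Only the 22 fixed and 14 added figures and the eight-iteration window come from the paragraph above.

```python
from statistics import mean


def estimate_days_to_zero(backlog, fixed_per_iteration, added_per_iteration,
                          window=8, days_per_iteration=7):
    """Estimate the days needed to drive the backlog to zero.

    Hypothetical helper: velocity is the average number of bugs fixed and
    flux is the average number added over the last `window` iterations.
    Returns None when the backlog is not shrinking.
    """
    velocity = mean(fixed_per_iteration[-window:])
    flux = mean(added_per_iteration[-window:])
    relative_velocity = velocity - flux
    if relative_velocity <= 0:
        # Bugs arrive at least as fast as they are fixed; no finite estimate.
        return None
    iterations_left = backlog / relative_velocity
    return iterations_left * days_per_iteration


# Illustrative data: eight weeks that all look like this past week.
fixed = [22] * 8
added = [14] * 8
print(estimate_days_to_zero(240, fixed, added))  # 240 / 8 * 7 = 210.0
```

When the flux matches or exceeds the velocity, the sketch returns None rather than a number, which is the "running just to stand still" case.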

Why is the Purple squad so good at closing bugs in the critical backlog?

I do not know the answer to my question. The critical backlog reached its all-time low of 250 bugs with the release of the Purple squad’s maintenance work in July 2011. There was supposition that Purple fixed the easy bugs, or that the fixes did not address the root cause, so another critical bug was opened. I disagree. The squad had no trouble finding easy bugs, and it too would have been fixing secondary bugs if the first fix was incomplete. I can tell you how the squad works on critical bugs, but not why it is successful.

I was surprised to see that the Purple squad was still the top critical bug closer when it returned to maintenance after 15 months of feature work. How could that be? The squad fixed a lot of old timeout and JavaScript bugs in the last few months through systemic changes, enough to significantly affect the statistics. About 600 critical bugs were closed while the Purple squad was on feature work. The squad closed 210 of those bugs: 60 were regressions that were fixed within the iteration, so they never showed up in the backlog; 70 were fixed because they blocked the feature; and 80 were fixed because Purple was the only squad awake when the issue was reported. The 4 other squads fixed an average of 98 bugs each when they were on maintenance. The Purple squad fixed more bugs than the maintenance squads did on average, even when it was not officially doing maintenance work. The data, charts, and analysis always include the Purple squad.

I suspect the Purple squad has more familiarity with bugs in the critical backlog. They never stopped reading the critical bugs when they were on feature work. They saw opportunities to fix critical bugs while solving feature problems. I know some of the squad members are subscribed to all critical bugs and re-read them often. They triage and re-triage Launchpad bugs. This familiarity means that many bugs are ready to code — they know where the problem is and how to fix it before the work is assigned to them. They fixed many bugs in less than a day, often doing exactly what was suggested in the bug comments.

During the first week of their return to maintenance, about 30 critical bugs were discovered to be dupes of other bugs. Though this change does make the backlog count fall, it also revised all the data, so the chart is not showing these 30 bugs at all now. The decline of backlog bugs does not include dupes. While the squad was familiar enough to find many bugs that they could close in a single day, they were not so familiar as to have known that there were 30 duplicate bugs in the backlog when they started.

Most squads have only one person with DB access, but the Purple squad is blessed with 3 people who can test queries against production-level data. This could be a significant factor. It is nigh impossible to fix a timeout bug without proper database testing. Only 13 of the recently closed bugs were timeouts, though. The access also helps plan proper fixes for other bugs, so maybe 20% of the fixed bugs can be attributed to database access.

Maybe the Purple squad are better maintenance engineers than other squads who work on maintenance. For 28 months, I was the leading bug closer working on Launchpad. I closed 3 times more bugs than the average Launchpad engineer. I am not a great engineer though. My “winning” streak came to a close shortly after William Grant started working on Launchpad full time; he soundly trounced me over several months. Then he and I were put on the same squad and asked to fix critical bugs. Purple also had Jon Sackett, who was closing almost twice as many bugs as the other engineers. I don’t think I need to be humble on this matter. To use the vulgar, we rocked! Ian was the odd man on the Purple squad. He was the slowest bug closer, often going beyond our intended scope to fix an issue. Then Purple switched to feature work… Ian leapt to the first rank while the rest of the squad struggled. Ian fixed almost double the number of Disclosure bugs that other squad members did. The leading critical bug closer on the squad at the moment, though, is Steve Kowalik. This is his first time working on maintenance. His productivity has jumped since transitioning to maintenance.

I can only speculate as to why some engineers are better at maintenance, or can just close more bugs than others. A maintenance engineer must be familiar with the code and the rules that guide it. Feature engineers need to analyse issues and create new rules to guide code. I did not gradually become a leading bug closer; it happened in a single day. While solving one issue, I realised that the code I was looking at was flawed and was certainly causing a bug. I knew how to fix it, and with a few hours of extra effort, I could close two bugs in a single day. Closing bugs has always been easy since that moment.

I believe the Purple squad values certainty over severity and small scope over large scope when choosing which critical backlog bugs to fix. I created several charts that break the critical bugs into smaller categories. I suggested the squad burn down sub-categories of bugs like regressions, or 404s. The squad members are instead fixing bugs from the entire backlog. They are choosing bugs that they are certain they can fix in a few hours. I think the squad has tacitly agreed to fix bugs that are less than a day of effort. When this group is exhausted, they will fix issues that require days of effort, but also fix as many bugs. The last bugs to be fixed will be those that require many days to fix a single bug. Fixing the bugs with the highest certainty reduced our churn through the critical bugs; there are fewer to triage, to dupe, to get ready to code.

The Purple squad avoids doing feature-level design and effort to fix critical bugs. Feature-level efforts entail more risk, more planning, and much more time. There is often no guarantee, low certainty, that a feature will fix the issue. A faster change with higher certainty can fix the issue, but leaves cruft in the code that the engineers do not like. Choosing to do feature-level fixes when a more certain fix is available indicates there is tension between the Launchpad users who have a “critical” issue that stops them from using Launchpad, and the engineers who have a “high” issue maintaining mediocre code. I contend it is easier to do feature-level work when you are not interrupted with maintenance issues. When the Purple squad does choose to do feature-level work to fix a critical, they have a list of the bugs they expect to fix, and they cut scope when fixing a single bug delays the fix of the others. The Launchpad Answers email subsystem was re-written when other options were not viable; there were about 20 leading timeouts represented by 5 specific bugs to justify 10 days of effort to fix them.

The Purple squad is not unique

Nothing that I have written explains why the Purple squad is better at closing critical bugs. All squads have roughly the same skills and make decisions like Purple. Maybe the issue is just a matter of degree. If the maintenance squad is not closing enough bugs to burn down the backlog, their time is consumed by triaging and duping new critical bug reports. Familiarity with Launchpad’s 1000’s of bugs is an advantage when triaging bugs and getting a bug ready to code. Being able to test queries yourself on a production-level database takes hours or days off the time needed to fix an issue. Familiarity with the code and the reasoning that guided it increases the certainty of success. The only domain that Purple is not comfortable working with is lp.translations; the squad is comfortable changing 90% of Launchpad’s code. There may be a correlation between familiarity with the code and the facts that the squad members participated in the apocalypse that re-organised the code base, and that some have a LoC credit count in the 1000’s.

Launchpad JavaScript now combo loaded and faster than ever.

Tuesday, September 25th, 2012
Network graph of the combo loaded JavaScript.

Updated network graph

Back in January a side project was started to update the JavaScript used in Launchpad. Launchpad has been using YUI 3.3.0 for a long time, very successfully. However, recent advances in YUI 3.5 and higher have added some great tools for development that Launchpad can take advantage of. To make upgrading our YUI library version easier, Launchpad has been moved to using a combo loader for serving JavaScript.

This means that instead of a single launchpad.js file that can be upwards of 3MB in size, each request builds a list of JavaScript modules needed for the current page to work, and the combo loader only sends down those modules. This drastically cuts down on the download size of the JavaScript for users. These combo loaded JavaScript files are also cached for speedy serving to other users of Launchpad.

The combo loader also allows us to specify which YUI version to load via a tweak to the URL. In this way we can easily test new versions of YUI side by side with the current stable version as they come out. This allows Launchpad to keep up with future YUI releases much faster.
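To make the idea concrete, here is a minimal sketch of how a combo request might be assembled from a page's module list. It borrows the query layout used by YUI's public combo service (version/build/module/module-min.js, joined by ampersands). The base URL and the function name are placeholders of mine, and Launchpad's real combo loader paths may well differ.

```python
def combo_url(modules, yui_version="3.5.1",
              base="https://example.invalid/combo"):
    """Build one combo request for the modules a page needs.

    Hypothetical helper: `base` is a placeholder host, and the path layout
    is an assumption modelled on the public YUI combo service.
    """
    seen = set()
    paths = []
    for name in modules:
        if name in seen:
            # Request each module only once, preserving first-seen order.
            continue
        seen.add(name)
        paths.append(f"{yui_version}/build/{name}/{name}-min.js")
    return base + "?" + "&".join(paths)


# One request for just the modules this page uses.
print(combo_url(["yui", "oop", "yui"]))

# Trying a newer YUI release is only a change to the version in the URL.
print(combo_url(["node"], yui_version="3.7.0"))
```

Because the module list and version fully determine the URL, identical pages produce identical requests, which is what makes the responses easy to cache.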

We’re excited that today Launchpad has moved from YUI 3.3.0 to 3.5.1 and is now served by the combo loader. This change provides a faster experience for users along with easier maintenance and new JavaScript library features for developers.

We’ve still got more to do though. YUI just released version 3.7 and we aim to push that into production faster than ever before. Please let us know how these changes work for you.

Launchpad also wants to thank the folks over at YUI for continuing the great work on a tool that Launchpad heavily depends on.

Privacy for blueprints enabled for beta testers

Monday, September 17th, 2012

To go along with recent work to enable information sharing for bugs and branches, we are now enabling privacy for blueprints for beta testers. This means that blueprints now support some of the different information types that bugs and branches also support. For projects with a commercial subscription on Launchpad, this means blueprints can now be set to proprietary or embargoed. Project owners can also manage sharing for blueprints from their project’s sharing details page. For more on how sharing itself works, see Curtis’ blog post that announced that Information sharing is now in beta for everyone.

We have some minor fit-n-finish issues to complete, like nicer UI elements, and of those, we have one last known bug in progress — we know that blueprints don’t currently honor the sharing policy default when new blueprints are created. However, we thought it was worth getting this work to beta testers now to start getting feedback on this as we turn to finishing off the privacy work that is left to do.

Enjoy privacy for blueprints, beta testers! And please file bugs on any issues you find.

Information sharing is now in beta for everyone

Tuesday, August 28th, 2012

Launchpad’s bug and branch privacy features are being replaced by information sharing that permits project maintainers to share kinds of information with people at the project level. No one needs to manage bug and branch subscriptions to ensure trusted users have access to confidential information.

Maintainers can share and unshare their project with people

Project maintainers and drivers can see the “Sharing” link on their project’s front page. The page lists every user and team that the project shares with. During the transition period of the beta, you might see many users with “Some” access to “Private Security” or “Private” user information. They have this access because they are subscribed to bugs and branches. Maintainers can unshare with users who do not need access to any confidential information, or just unshare a bug or branch with a user. Maintainers can share with a team to give them full access to one or more kinds of confidential information.

I have prepared a video that demonstrates the features (my apologies for the flickering)

Commercial projects can set bug and branch policies

Projects with commercial subscriptions can also change bug and branch sharing policies to set the default information type of a bug or branch, and control what types they may be changed to. Maintainers can set policies that ensure that bugs and branches are proprietary, and only proprietary, to ensure confidential information is never disclosed.

Sharing can be managed using API scripts

I maintain many projects that have a lot of private bugs and branches. The sharing page lists a lot of people, too many to read quickly. I know most work for my organisation, but I don’t even know everyone in my organisation. So I wrote a Launchpad API script that can be run by any project maintainer to share the project with a team, then unshare with the team members. The members still have access to the bugs and branches and their subscriptions still work, but they will lose access to my project when they leave the team. This arrangement makes it very easy to manage who has access to my projects. share-projects-with-team.py is run with the name of the team and a list of projects to share with it.

./share-projects-with-team.py my-team project1 project2
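The decision the script makes can be sketched without touching Launchpad at all. The snippet below is not share-projects-with-team.py and does not call the Launchpad API; it is a plain-Python illustration, with hypothetical names, of working out which grants to add and which individual grants to remove for one project.

```python
def plan_sharing_changes(current_sharees, team, team_members):
    """Work out the sharing changes for one project.

    Hypothetical helper operating on plain names, not Launchpad objects.
    Returns (to_share, to_unshare): the team is added if it is missing,
    and people who are covered by the team lose their individual grant.
    """
    to_share = [] if team in current_sharees else [team]
    to_unshare = sorted(set(current_sharees) & set(team_members))
    return to_share, to_unshare


sharees = {"alice", "bob", "carol"}
members = {"alice", "bob", "dave"}
print(plan_sharing_changes(sharees, "my-team", members))
# (['my-team'], ['alice', 'bob'])
```

Here carol keeps her individual access because she is not in the team, and dave gains access only through the team, so he loses it when he leaves.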

New fastdowntime schedule

Tuesday, August 14th, 2012

For the last year, Launchpad has been doing schema patches using a process we call ‘FDT’, short for Fast Down Time. We have applied 60 such patches, typically taking between 60 and 90 seconds each time, at 1000 UTC, our scheduled daily 5 minute downtime window for DB patching.

Recently, we eliminated Slony from our environment, which has dropped the overhead of schema patches to ~6 seconds, and this gives us <10 second downtimes to apply schema patches. We’re taking advantage of this to add two new downtime windows at 0200 UTC and 1800 UTC. All three windows will be for 10 seconds. Hopefully you will never notice that we’re doing schema patches. But if Launchpad is offline for a few seconds at one of these times, you’ll know why – we’re busy rolling out a schema change to bring a new feature to life.
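If you run scripts against Launchpad and want to avoid these moments, the schedule is simple enough to check for. The sketch below is only an illustration with names of my own choosing; the three start times and the 10 second length come from the paragraph above.

```python
from datetime import datetime, time, timedelta, timezone

# The three daily fastdowntime windows described above, all in UTC.
WINDOW_STARTS_UTC = [time(2, 0), time(10, 0), time(18, 0)]
WINDOW_LENGTH = timedelta(seconds=10)


def in_downtime_window(moment):
    """Return True if a timezone-aware datetime falls inside a window."""
    moment = moment.astimezone(timezone.utc)
    for start in WINDOW_STARTS_UTC:
        window_start = datetime.combine(moment.date(), start,
                                        tzinfo=timezone.utc)
        if window_start <= moment < window_start + WINDOW_LENGTH:
            return True
    return False


print(in_downtime_window(datetime(2012, 8, 14, 10, 0, 5, tzinfo=timezone.utc)))
print(in_downtime_window(datetime(2012, 8, 14, 12, 0, 0, tzinfo=timezone.utc)))
```

A script could retry after a short pause when this returns True, rather than treating a brief outage as a failure.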

Project maintainers can see private bugs

Monday, July 23rd, 2012

Project maintainers can now see all the private bugs in their project. While Launchpad tried to ensure the proper people could see private bugs in the past, the old subscription mechanism was brittle. Users could unsubscribe themselves and lose access, or retarget a bug to another project, which does not update bug subscriptions. The Purple squad migrated project configurations to project sharing so that all private information was shared with project maintainers. Project sharing ensures that confidential information is disclosed to the proper people.

If you are a project maintainer, you might be surprised to find old private bugs that you have never seen before. This happened to me. Some ancient private bugs were in the “New” listing of bugs; others were buried in search results. You can search for just private bugs to review all private bugs.

advanced search for private information types

Privacy terminology is restored

We reverted the information type terminology changes introduced a few months ago.

  • User data ➙ Private
  • Embargoed Security ➙ Private Security
  • Unembargoed Security ➙ Public Security

While the jargon-laden terms helped the small number of people who work with confidential information, the people who report bugs were confused. The most common reason for unwanted disclosure is that people enter confidential information, and cannot see how to make it private. Sometimes a user may not notice the mistake until a few minutes later. We also revised the descriptions of the information types to help new users quickly select the correct information type.

change information type

You can hide your bug and question comments

Monday, July 23rd, 2012

You can now hide your own bug and question comments. If you want to hide a comment made in error, you can use the “Hide comment” action.

hide your comment

You can see it, and even unhide it if you choose. The project’s maintainer or the trusted people delegated to work with private information can still see your comment.

your hidden comment

This allows you, or the people the project shares private information with, to hide just the comments that contain personal information. The bug does not need to be made private if the comment can be hidden. Project maintainers can also hide comments because they contain spam or abuse.

 

Beta test: asynchronous PPA package copies

Wednesday, July 18th, 2012

The Ubuntu Foundations team has sponsored work on various improvements to Launchpad’s archive handling lately, mainly to expose various new facilities on the API where we were previously using privileged scripts.  This has involved cleaning up a substantial amount of old code along the way, and it has become possible to fix some other old bugs as spin-offs.

One of these old bugs is “Archive:+copy-packages nearly unusable due to timeouts”.  The +copy-packages page allows anyone who can upload to a PPA to instead copy packages from another PPA.  This saves effort, and in the “Copy existing binaries” mode it can save a substantial amount of build time as well.  For example, the LibreOffice packaging team uses this to deliver packages to different sets of users after they have passed various levels of testing.

Unfortunately, the very cases where this is most useful, namely large and complex packages, are also the cases where it is most likely to break.  Copying large numbers of binary packages involves large numbers of database queries and can quite easily overrun the timeout for a single request to the Launchpad web application.  Doing this for several series at once, a common case which seems reasonable, is proportionally less likely to work.  Various attempts have been made to optimise the database interactions here, but ultimately doing lots of complex synchronous work in time for a single web request is doomed to failure.

The solution to all this is to copy packages asynchronously.  For some time Launchpad has had the ability to schedule “package copy jobs” which run very shortly after the request (typically within a minute) but not immediately.  For example, the Ubuntu team uses these when copying new versions of packages from Debian unstable in cases where there are no Ubuntu-specific modifications, and when releasing proposed updates to stable releases for general use after verification.  A similar facility has been present in the code for the +copy-packages page for some time, but not exposed due to various bugs.  We believe that these bugs have been fixed now, and so we would like to start copying packages asynchronously when requested via the web UI.

We have exposed this to beta testers first.  The effect is that, if you are a beta tester when you ask for packages to be copied, you will be told something like “Requested sync of 2 packages.  Please allow some time for these to be processed.”  The processing should normally happen within a minute or two, and you will be able to see it in progress on the +packages page for the target archive.  If it succeeds, the in-progress notification will be removed and you will be able to see the changes in the target archive.  Otherwise, you will see a failure notification along these lines:

A notification of a failed copy to a PPA.

If beta-testing goes well, then we will enable this for all users, and remove the old synchronous copying code shortly afterwards; so please do report any problems you see.

If you are relying on package copies in the web UI happening immediately rather than within a few minutes, firstly, please contact us (e.g. #launchpad-dev on freenode IRC, or launchpad-users@lists.launchpad.net) as we would like to understand your requirements in more detail; secondly, you may be able to use the Archive.syncSource API method instead, which also has timeout constraints but is at least guaranteed to remain synchronous.  However, we hope that most people will not have such a requirement.

Bug reporting and search knows about privacy

Monday, July 16th, 2012

The Purple squad recently updated bug reporting and searching to understand the new privacy rules. Some of the changes were requirements to support sharing, others were opportunities we took advantage of.

Improvements to bug reporting and forms

The Purple squad updated the bug reporting UI to make it consistent with the bug pages. We chose to develop one consistent and tested UI rather than update the many kinds of widgets used in bug forms.

  • Project maintainers, drivers, and bug supervisors can report private bugs.
  • Autocomplete works with bug tags.
  • The status and importance controls show their definitions.
  • Undecided is the first importance because it is the default importance.

Improvements to bug searching

Advanced bug search was updated after we discovered that recent changes made it possible to fix some long-standing issues with a few additional lines of code.

  • Anyone can search for Private or Embargoed Security bugs that are shared with them.
  • Autocomplete works with bug tags.

Usability and Accessibility fixes

We discovered that the popups that show bug status, importance and information type did not work with keyboards. It was possible to tab out of every other kind of popup by accident. We made deep fixes to the code so that all Launchpad popups work with the keyboard.

  • You can use the tab key to move between the items in popups.
  • You cannot accidentally tab out of any popup.