Boeing 787s must be reset every 51 days or 'misleading data' is shown (2020)

203 points by jakey_bakey 3 days ago

smcleod 3 days ago

I was speaking with a 787 pilot last Sunday. I told him that the week before, at an airport, two pilots sitting next to me had been talking about how "This is the third bloody 787 rescue we've had this month... I can't believe we had full engine and <I think he said auxiliary?> failure at the same time". I asked him if this is common and he said "I hear of it, but I haven't had that many major failures, just lots of little things. Last time I flew in from <city>, a few moments after we touched down we lost auxiliary power from the rear engine, and all the cabin lighting went black along with a number of other things. Thankfully we were straight and had already lost most of the speed we were carrying, so we were fine and taxied to the disembark location. They had it up and flying again within the day, but it certainly was disconcerting to say the least".

I'm slightly paraphrasing from memory there, but I was certainly quite surprised at how calm he was about the whole thing. There's no way I'd board one of those things.

jiggawatts 3 days ago

Modern two-engine planes like the 787 have an auxiliary power unit (APU) in the tail. This is a small turbine that runs a generator and a pump for the hydraulics. It’s typically only turned on when the plane is on the ground, or if there’s an emergency in mid-air. It is also needed to start the main engines so if the APU is faulty the plane will probably be stuck where it is. In theory a 787 can take off with just one engine but this is not very safe and wouldn’t be done in all but the most exceptional circumstances.
There are variations on this depending on the plane model, of course. Some older planes can use an external starter for their engines, but I think that’s very rare now.
- thecosmicfrog 3 days ago
  
  Aircraft with INOP APUs can generally be "air started" with a ground-based high-pressure air system. It's relatively common and I've been on a plane that had to do the procedure. It was entirely undramatic other than engines being started before the pushback, but I doubt most passengers even noticed.
  Now, interestingly, the 787 is a "bleedless" aircraft, so it doesn't use high-pressure air from the APU to spool up the engines. I believe it can use its hefty bank of lithium-ion batteries to start its engines if the APU (and associated electrical generator) is INOP.
  Not a pilot/engineer - just an enthusiast. Someone more au fait with the 787 might be able to correct me on the above.
  - hinkley 3 days ago
    
    My understanding is that there was a push to modify the U shaped tow trucks they use to position planes to have a battery powered system to start the engines.
    The idea being that the APU isn't particularly clean burning, not compared to power plant emissions. It's been a long while since I've heard anything about that plan, for or against.
    
    thecosmicfrog 3 days ago
    
    Interesting! Although it'd (presumably) only be useful for the 787, short of heavy modification to existing aircraft. Even the Airbus A350, an aircraft from the same era, uses a traditional bleed system. If planes continue down the bleedless route I can see it happening.
  - inferiorhuman 2 days ago
    
    Yeah the 787 can be started electrically but it takes a ton of juice.
    https://www.youtube.com/watch?v=1W_RtawHVvw
- mrguyorama 2 days ago
  
  >Modern two-engine planes like the 787 have an auxiliary power unit (APU)
  Where "modern" here includes jet airliners made in the 70s yes.
  >It is also needed to start the main engines
  The engines need an air source, and the APU can be an air source, but at one point at least, big airlines preferred using ground hookup provided air sources for starting, in order to save gas. Next time you fly, look at the jetway. There will be a large yellow duct system underneath it that can be hooked up to the plane to provide pneumatic pressure and air conditioning air without starting the APU. There are similar hookups for electrical power so that a plane won't drain its battery during routine turnover operations.
  The bottom price flights I've taken recently don't seem to hook either up though, preferring instead to start the APU during taxi to the gate while shutting down one engine, shut down the other engine once they are at the gate, and reverse the process to taxi back out to the runway. The turnaround time is so short, and the required work to clean and restock the cabin so little, I bet they just don't pay for ground hookups.
Filligree 3 days ago

APU failure maybe? That would be troublesome indeed; with no engines and no APU you'd lose most instrumentation and a lot of the hydraulics.
- n_ary 3 days ago
  
  There is also a RAT at the back that can be deployed to generate some power (~5-10 minutes max) in case of a severe emergency in the air. It is what you hear sometimes, when the aircraft is making a very shrill noise flying over your head.
  However, if it is not a test flight, a RAT deployment should make you very uncomfortable and worried…
  - kashunstva 2 days ago
    
    > RAT … It is what you hear sometimes, when the aircraft is making a very shrill noise flying over your head.
    I’ve been around a lot of airplanes and I can’t say I’ve seen or heard a ram air turbine deployed in flight. There was a recent incident involving a Frontier Airlines flight in which the RAT was deployed when the aircraft was put in emergency electrical configuration. The deployment of the RAT would be quite rare.
  - ortusdux 3 days ago
    
    https://en.wikipedia.org/wiki/Ram_air_turbine
  - DiggyJohnson 2 days ago
    
    The chances of you being on multiple commercial flights where the Ram Air Turbine is deployed are infinitesimal, no?
    Also, RAT can power limited systems indefinitely on most models, not all or limited systems for a limited amount of time.
  - iwontberude 3 days ago
    
    I find it hard to believe that anyone reading this was within earshot of a plane in a severe emergency and heard this particular sound and since turbine engines are already quite shrill I am basically just sorta confused who your audience is for this suggestion.
    
    inferiorhuman 2 days ago
    
    The RAT makes an extremely distinctive sound. You'd recognize it nearly instantly. However the RAT will not power everything.
    https://www.youtube.com/watch?v=KzejbxNj1hY
    
    wildzzz 2 days ago
    
    That's cute, it sounds like a little Cessna
    
    rob74 2 days ago
    
    Usually, when the RAT is really deployed because of an emergency, the jet engines will be a lot more silent (because they're not producing any power). Although I'm not really sure how loud a windmilling jet engine really is, and I somehow doubt there is a YouTube video of a plane landing with both engines disabled - but you never know...
    
    happosai 2 days ago
    
    Indeed unlikely to hear RAT deployed due to emergency. But they do deploy it sometimes on test flights after maintenance.
    
    fshbbdssbbgdd 2 days ago
    
    Would you hear it from inside the plane? Even if it’s not as loud as the main engine, if it’s audible at all a lot of people would notice a change in pitch/tone. At least, I notice when the sounds the plane is making change even though I don’t know anything about the reason.
    
    jszymborski 2 days ago
    
    It's apparently quite loud
    > After starting the descent, the flight crew made an announcement to the passengers; however, unbeknownst to the flight crew, the noise generated by the RAT (because of its high rotation speed) prevented the passengers and the cabin crew from hearing the announcement.
    https://asn.flightsafety.org/wikibase/187755
    
    Filligree 2 days ago
    
    Oww. Seems they got lucky.
    It always surprised me that there aren’t small, local lithium batteries to provide backup power for critical components like the smoke detectors. Is the risk of those catching fire considered too high?
    
    jszymborski 2 days ago
    
    I know next to nothing about planes, but I think another comment here suggested some newer planes do have large Li-ion battery banks.
    That said, a household smoke detector runs on next to no power. Obviously not the same device but surely it can operate on the same principle.
    
    mrguyorama 2 days ago
    
    >It always surprised me that there aren’t small, local lithium batteries to provide backup power for critical components like the smoke detectors
    There is, well, only lithium on the 787. If all power generation is dead, then the most critical flight instruments and gauges get about 20-30 minutes of power from the plane's batteries, things like your backup old fashioned gauges, the engine computers, and maybe some basic flight computer on newer planes. The RAT is intended to keep flight surfaces operational when everything else is utterly fucked, so it usually produces the same kind of energy as whatever the primary flight control system uses, which until recently was hydraulic power. On civilian airliners they generate tens of kilowatts. Airliners do not want to carry around an EV sized battery for the extremely rare occasions when you lose all systems, because that's a waste of gas. The RAT provides the same functionality for lower weight.
    When the RAT is deployed, you do not care much whether a smoke detector is powered, you are already vectoring towards an attempted landing.
    
    fphhotchips 2 days ago
    
    I feel like it's not the RAT you'll notice from inside the plane, it will be the silence from the engines. That combined with at least a momentary flicker of the lighting (I'm not sure if a RAT on a 787 will run cabin lighting but I doubt it), and you'll know.
    
    rolandog 3 days ago
    
    Username does not check out.
    Jokes aside... I'm certainly part of the intended audience: point me at an interesting rabbit hole, and there I gooo.
    
    iwontberude 3 days ago
    
    Haha I didn’t parse it that way but I can see how you thought that upon rereading. I just want to understand why we would hear the RAT when there wasn’t an emergency overhead. I suppose planes regularly test them?
    
    rogerrogerr 2 days ago
    
    They don’t.
    
    inferiorhuman 2 days ago
    
    I'm not going to bother slogging through everything to be able to speak in specifics for every airplane ever built, but:
    A RAT provides backup electrical and/or hydraulic power for control surfaces (and other goodies). A RAT would certainly be inspected during a heavy check and likely even during line checks (e.g. an "A" check or equivalent). How often is gonna depend on the airplane. But to suggest that a critical piece of equipment isn't checked regularly is just silly.
    Additionally, it's pretty much guaranteed that if an airplane comes with a RAT the RAT is required to be functional for ETOPS flights. That alone means you're gonna be inspecting it pretty frequently. ETOPS certification has three parts: airplane, airline, and humans. You'd want to look at the ETOPS Maintenance Document at whatever airline to be sure.
    Outside of Asia (where domestic widebody flights are still common) I'd guess many if not most 787 flights are ETOPS flights.
    
    skissane 2 days ago
    
    > Additionally, it's pretty much guaranteed that if an airplane comes with a RAT the RAT is required to be functional for ETOPS flights. That alone means you're gonna be inspecting it pretty frequently.
    I remember a decade or more ago I was on a US domestic flight - I forget exactly what, I think it was American from SFO to LAX - so I doubt it needed ETOPS. But the captain announced - while we were still at the gate - that he was getting an error in the cockpit saying the RAT was faulty. And he called maintenance, and they told him to try resetting something (a computer or circuit breaker or whatever) to see if that cleared the error - and when it didn’t, he announced we could not take off and would all have to go back into the terminal. Thankfully they had a spare plane a few gates over and they put us all on that (same crew, same passengers) so we only lost an hour or two.
    
    inferiorhuman 2 days ago
    
    Right. In the context of this discussion ETOPS buys you significantly increased inspection and maintenance requirements. That's why I don't like playing this game of telephone. Someone told someone else that something else did something else. Were everything to have unfolded as transcribed here there almost certainly would've been a high profile investigation.
    Back to your flight, both the FAA and EASA require airliners to have a minimum equipment list (MEL). It's entirely unrelated to ETOPS (overwater flights). This list describes what equipment is required to be functional, what you can fly without and when. What's on the list is all going to come down to what kind of plane we're talking about. Could be you're not allowed to fly without a functional RAT ever. Could be that you can fly without a RAT as long as something else (e.g. APU) is functional. Could be you can only make a certain number of flights with a non-op RAT.
    A real world example is that ATR 72 crash in Brazil recently. One of the PACKs (air conditioning / cabin pressurization) was not functioning on the accident plane. Per the MEL you can dispatch an ATR in that condition, but you're limited to a service ceiling of 17,000 ft. Unfortunately that put the flight in direct conflict with the weather.
    
    rogerrogerr 2 days ago
    
    You’re right; my statement was in the context of the above discussion about people claiming to somewhat-regularly hear RATs in the air above them. That definitely isn’t happening.
  - mulmen 2 days ago
    
    Why would the Ram Air Turbine be time restricted? As long as the plane is moving there’s power.
    
    logifail 2 days ago
    
    > Why would the Ram Air Turbine be time restricted?
    Gravity?
    
    mulmen 2 days ago
    
    But the turbine generates power to keep the plane flying. Why would it only work for 10 minutes? Certainly the flight time is a product of fuel level and altitude. Even if both engines fail the flight time would be a function of altitude. I don’t see how deployment of the RAT informs flight time.
    
    chillel 2 days ago
    
    It does not generate power to provide thrust; it generates power - using the airstream as the aircraft moves through the air - for the avionics and/or hydraulics.
    
    mulmen 2 days ago
    
    Yes, exactly.
    
    reportgunner 2 days ago
    
    If the RAT could keep the plane flying indefinitely we would just be using RAT instead of fuel I suppose.
    
    DiggyJohnson 2 days ago
    
    /s? A generator or alternator powered directly by the engines is more efficient than towing a wind vane (still indirectly powered by the engines and/or the potential energy of the airplane) every single time.
    This discussion has nothing to do with engine out failure modes.
    
    JumpCrisscross 2 days ago
    
    > This discussion has nothing to do with engine out failure modes.
    The 787's APU is not intended to run during flight. If it's running, you're in an engines-out scenario.
    
    mulmen 2 days ago
    
    Huh? It’s a generator. It generates minimum power to keep the flight controls and instruments working. It’s not propulsion.
    
    mrguyorama 2 days ago
    
    By definition, when you are using the RAT, you don't have any electrical power and you probably don't have thrust.
    You are constrained by battery, airspeed, and altitude.
    
    mulmen 2 days ago
    
    Well you aren’t constrained by battery if the RAT is deployed, that’s the point of the RAT.
    “Probably” is doing a lot of work here, there could be a power failure without engine failure.
    
    delfinom 2 days ago
    
    The RAT is a generator, not a device that can provide thrust. If anything it will minutely slow the plane down.
- smcleod 3 days ago
  
  I thought the guy I was speaking to mentioned something about instrumentation, but I wasn't 100% sure and that sounded more serious so I didn't mention it - but if the aux engine failing would do that - I guess that lines up!

Dylan16807 3 days ago

Previous: https://news.ycombinator.com/item?id=22761395 https://news.ycombinator.com/item?id=33233827

More interesting, a root cause analysis: https://news.ycombinator.com/item?id=33239443 https://ioactive.com/reverse-engineers-perspective-on-the-bo...

The 47-bit timestamp at 32 MHz would explain the duration (though not why it isn't 33 MHz?).
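
A quick check of that arithmetic, as a sketch only: it assumes a free-running 47-bit counter ticking at exactly 32 MHz (or 33 MHz for comparison). The helper name `wrap_days` is made up for illustration and is not taken from the linked analysis.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def wrap_days(bits: int, tick_hz: float) -> float:
    """Days until a free-running counter of `bits` bits wraps at `tick_hz`."""
    return (2 ** bits) / tick_hz / SECONDS_PER_DAY


if __name__ == "__main__":
    for mhz in (32, 33):
        days = wrap_days(47, mhz * 1e6)
        print(f"47-bit counter at {mhz} MHz wraps after {days:.2f} days")
```

Under those assumptions the 32 MHz case lands just under 51 days, while 33 MHz would give a bit over 49 days, which is why the clock rate matters to the explanation.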

dang 2 days ago

Thanks! Macroexpanded:
Reverse Engineer’s Perspective on the Boeing 787 ‘51 days’ Directive - https://news.ycombinator.com/item?id=33239443 - Oct 2022 (55 comments)
Boeing 787s must be rebooted every 51 days to prevent 'misleading data' (2020) - https://news.ycombinator.com/item?id=33233827 - Oct 2022 (140 comments)
Boeing 787s must be turned off and on every 51 days (2020) - https://news.ycombinator.com/item?id=27117320 - May 2021 (42 comments)
Boeing 787s must be turned off and on every 51 days - https://news.ycombinator.com/item?id=27111650 - May 2021 (4 comments)
Boeing 787s must be turned off and on every 51 days to prevent 'misleading data' - https://news.ycombinator.com/item?id=22761395 - April 2020 (152 comments)
gfv 2 days ago

I have a way simpler explanation. IEEE 754 double can only represent integers up to 2^53 without precision loss, so if you naively average two numbers greater than 2^52, you get an erroneous result.
It just so happens that 2^52 nanoseconds is a little bit over 52 days.
I've seen the same thing with AMD CPUs where they hang after ~1042 days which is 2^53 10-nanosecond intervals.
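
For anyone who wants to check those numbers, here is a minimal sketch. Whether the 787 fault actually involves doubles is a hypothesis from this comment, not something the directive states. The helper `ticks_to_days` and the sample timestamps are invented for illustration.

```python
NS_PER_DAY = 24 * 60 * 60 * 10 ** 9


def ticks_to_days(ticks: int, ns_per_tick: int = 1) -> float:
    """Convert a tick count to days, given the tick length in nanoseconds."""
    return ticks * ns_per_tick / NS_PER_DAY


# Doubles carry a 53-bit significand, so 2**53 + 1 has no exact representation.
largest_safe = 2 ** 53
lost_increment = float(largest_safe) + 1.0 == float(largest_safe)

# Two timestamps just above 2**52 whose exact sum (2**53 + 3) is odd.
a, b = 2 ** 52 + 1, 2 ** 52 + 2
exact_sum = a + b                      # Python ints: exact
float_sum = int(float(a) + float(b))   # doubles: rounded to an even neighbour

if __name__ == "__main__":
    print(f"2**52 ns             = {ticks_to_days(2 ** 52):.2f} days")
    print(f"2**53 ticks of 10 ns = {ticks_to_days(2 ** 53, 10):.1f} days")
    print("float(2**53) + 1 == float(2**53):", lost_increment)
    print("error in the double-precision sum:", float_sum - exact_sum)
```

The intermediate sum is where precision goes first: each input is still exactly representable, but their sum already exceeds 2^53.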
- ShroudedNight 13 hours ago
  
  Having done exactly this math for GStreamer bindings in JavaScript (where the built in numeric types are double or nothing), this would also be my prime suspect.
- withinboredom 2 days ago
  This is incorrect. Very incorrect and disastrously so. Drop 0.3 in here: https://www.h-schmidt.net/FloatConverter/IEEE754.html
  You can also drop in 524535643, an integer clearly less than 2^53, and it is off by 5.
  This is even seen here:
  #include <stdio.h>
  int main(void) {
      /* single-precision float: 524535643 is not exactly representable */
      float b = 524535643.0f;
      printf("%f\n", b);
      return 0;
  }
  output: 524535648.000000
  - gfv 2 days ago
    
    I was talking specifically about double-precision floats. Single-precision floats can represent every integer up to +-2^24.
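
The float-versus-double distinction can be checked without a C compiler. This is a rough sketch that round-trips values through the standard `struct` module to emulate single precision. The helper names (`to_float32`, `exact_in_float32`, `exact_in_float64`) are made up for this example.

```python
import struct


def to_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def exact_in_float32(n: int) -> bool:
    """True if the integer n survives a trip through a 32-bit float."""
    return to_float32(float(n)) == n


def exact_in_float64(n: int) -> bool:
    """True if the integer n survives conversion to a 64-bit double."""
    return float(n) == n


if __name__ == "__main__":
    n = 524535643
    print("as float32:", to_float32(float(n)))
    print("exact in float32?", exact_in_float32(n))
    print("exact in float64?", exact_in_float64(n))
    print("2**24 + 1 exact in float32?", exact_in_float32(2 ** 24 + 1))
    print("2**53 + 1 exact in float64?", exact_in_float64(2 ** 53 + 1))
```

The same integer that is off by 5 in single precision is stored exactly as a double, which is consistent with both comments above.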
    
    withinboredom 2 days ago
    
    Ah, missed that detail.
  - DiggyJohnson 2 days ago
    
    How could their comment possibly be "incorrect and disastrously so" unless the FAA is citing this thread for their investigation?
    
    withinboredom 2 days ago
    
    The comment said IEEE754 doubles can represent integers to 2^53. But I missed the double or assumed float. Floats cannot do that and it would be disastrous to assume so. For that matter, doubles also have some pretty big issues when you do operations on them (loss of precision), but as long as you are purely doing integer operations, it “should” be fine. A practical example with non-integers: 35 + -34.99
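
That last example is easy to try. A small sketch, using only the standard library, that compares the double-precision result with exact rational arithmetic (the variable names are arbitrary):

```python
from decimal import Decimal
from fractions import Fraction

# Double-precision arithmetic: 34.99 is stored as the nearest binary fraction.
result = 35 + -34.99

# Exact rational arithmetic on the decimal literals, for comparison.
exact = Fraction(35) - Fraction("34.99")

if __name__ == "__main__":
    print("double result:       ", result)
    print("stored value of 34.99:", Decimal(34.99))
    print("exact result:        ", exact)
    print("equal to 0.01?       ", result == 0.01)
```

The error is tiny, but it is enough to break any code that compares the result against 0.01 for equality.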
- h_tbob 2 days ago
  
  Please email Boeing!
  Seriously they should have posted here for some help!

hggigg 3 days ago

Had a similar problem to this many years ago. Happened every 24 days approximately and lost one user setting. Had a logic analyser connected to it for days trying to reproduce the issue in some way. Went to go for a piss and get a coffee one afternoon and came back and there it was triggered!

What happened? Well it turns out there was a timer that no one used that overflowed and caused an interrupt which wasn’t handled any more. The interrupt handler fell through and caused a halt, the WDT fired, rebooting it, and some idiot hadn’t stored that one setting in the NVRAM.
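
The comment does not say how wide the timer was or how fast it ticked. Purely as an illustrative assumption, a signed 32-bit count of milliseconds overflows after roughly 24.9 days, which is one common way to end up with a period like this. The sketch below works that out, alongside the unsigned case. Both function names are hypothetical.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def days_until_overflow(bits: int, signed: bool) -> float:
    """Days before a millisecond counter of the given width overflows."""
    usable_bits = bits - 1 if signed else bits
    return (2 ** usable_bits) / MS_PER_DAY


def wrap_signed32(value: int) -> int:
    """Emulate two's-complement wraparound of a 32-bit signed counter."""
    return (value + 2 ** 31) % 2 ** 32 - 2 ** 31


if __name__ == "__main__":
    print(f"signed 32-bit ms counter:   {days_until_overflow(32, True):.2f} days")
    print(f"unsigned 32-bit ms counter: {days_until_overflow(32, False):.2f} days")
    print("one tick past the maximum: ", wrap_signed32(2 ** 31))
```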

So then we had more problems. 5000 things with EPROMs in that were rebooting every 24 days which were spread all over the planet. Many questions to ask over how the hell it ended up like that.

I hope people are asking these sorts of questions at Boeing.

Edit: also the source code we had did not match what was on the devices. Turned out the engineer who provided the hex file hadn’t copied that code to the file server and had left a year beforehand. We didn’t find that until the WDT fired and piqued our interest and could reproduce it on the dev board because the software was different (should have checked that past the label on the ROM which was wrong!)

fnordpiglet 3 days ago

I’d note that commercial airplanes generally operate with 6-7 9’s of availability. For anyone that’s ever built a system with 5 9’s, this is impressive. In fact it’s impressive enough you probably don’t think twice about sleeping on a flight.

Aloisius 2 days ago

Six 9s would be half a minute of downtime per year.
I don't see how that is possible given the maintenance required for these planes. Even the simple A checks ground a plane for hours every couple hundred flights while D checks take months to complete every 6-10 years.
Edit: minute not hour
- benhurmarcel 2 days ago
  
  > I don't see how that is possible given the maintenance required for these planes
  You normally only count unplanned downtime in those stats for aircraft.
- csallen 2 days ago
  
  By my math, six 9's is 30 seconds, not 30 minutes?
  (1 - 0.999999) * (60 * 24 * 365)
  EDIT: This chart agrees: https://en.wikipedia.org/wiki/High_availability#Percentage_c...
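
The same calculation as a small sketch, assuming a 365-day year and treating "N nines" as an availability of 1 - 10^-N. The function name is made up for the example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(nines: int) -> float:
    """Allowed downtime per 365-day year at an availability of `nines` nines."""
    return SECONDS_PER_YEAR / 10 ** nines


if __name__ == "__main__":
    for n in (5, 6, 7):
        print(f"{n} nines: {downtime_seconds_per_year(n):.3f} seconds per year")
```

Dividing by a power of ten instead of computing `1 - 0.999999` keeps floating-point rounding out of the answer.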
  - Aloisius 2 days ago
    
    You are correct. Forgot what my units were in bc.
- echoangle 2 days ago
  
  The six 9s are probably meant as catastrophic failure rate, not downtime.
  - fnordpiglet 2 days ago
    
    It counts any event where the plane has an unplanned event including unscheduled maintenance, unplanned flight deviations, and of course catastrophic failures.
woah 3 days ago

If something goes wrong, does it matter whether you are asleep or awake?
- vkou 3 days ago
  
  Only when a flight attendant is asking on the intercom: "We don't mean to alarm anyone, but is anyone on board a pilot?" and you happen to be one.
  - hggigg 3 days ago
    
    I know a commercial pilot who used that as a joke once and got in trouble. The plane in question had several pilots on it but the rest of the passengers didn’t find it funny for obvious reasons.
  - incognito124 3 days ago
    
    It's entirely a different kind of flying
    All together
    
    TonyTrapp 2 days ago
    
    It's entirely a different kind of flying!
    
    Karellen 2 days ago
    
    It's entirely a different kind of flying
  - LeonB 3 days ago
    
    “We don’t wish to cause any alarm, but is there any one on board who is familiar with regular expressions, cron expressions and parameter expansion rules in bash?”
    
    wildzzz 2 days ago
    
    Several overweight men stand up and walk towards the cabin, immediately throwing off the weight distribution and the plane plummets.
    
    rsync 2 days ago
    
    You joke but… There was an emergency nose high recovery out of San Diego airport where at one point the pilot had every passenger crowd into the first class cabin…
    The flight was saved.
    
    meowster 2 days ago
    
    It looks like that was https://en.wikipedia.org/wiki/Delta_Air_Lines_Flight_1080
    
    yard2010 2 days ago
    
    You mean.. for free?
    
    sgarland 2 days ago
    
    I hope the first class passengers were well compensated for their traumatic experience.
  - hooverd 3 days ago
    
    Hopefully you didn't have the fish.
    
    blitzar 2 days ago
    
    Looks like I picked the wrong week to quit amphetamines
    
    mass_and_energy 2 days ago
    
    Looks like I picked the wrong week to stop sniffing glue!
- greenchair 3 days ago
  
  woosh!
Havoc 2 days ago

That is presumably historic data though?
6-7 nines is a lot of nines and we’ve had a couple of issues in quick succession now
OJFord 2 days ago

> it’s impressive enough you probably don’t think twice about sleeping on a flight.
I don't think twice about sleeping on a flight because I've already made my bed at that point - nothing I can do if something goes wrong.
(Well, I've woken my wife when a doctor was called for before, but that's about the extent of my usefulness.)
- fnordpiglet 2 days ago
  
  I’ll wager if you got into a situation you can’t escape where you had a 30% chance of a horrific death over the next six hours you wouldn’t snuggle into your sound suppressing headphones and doze off between snacks no matter how inevitable things are.
  - vikingerik 2 days ago
    
    Where in the world did you get that 30% number? Even on Boeing's worst planes, the chances of any incident are still much more like 0.003% or something like that. "30%" is just fearmongering.
  - lazide a day ago
    
    Meh, had worse odds and slept fine. Notably, however, odds are not nearly that bad, even in Aeroflot flights out of Africa. Or for that matter, combat flights in war zones.
    Besides, a plane crash is far from the worst way to go. Dramatic, sure.
    But dementia? Cancer? That’s often a pretty miserable death.
    Plenty of things out there to get worried about if you want.
lostlogin 2 days ago

> I’d note that commercial airplanes generally operate with 6-7 9’s of availability.
Maybe they used to, but Boeing has been doing rather worse and that’s the point here isn’t it?

abadpoli 2 days ago

Airbus A350s had the same issue: https://www.theregister.com/2019/07/25/a350_power_cycle_soft...

We’re just going to see more and more issues like this as more and more software is used in applications like this. I would be willing to bet that a Tesla would also spontaneously crash if left on for hundreds of hours, but they just rarely if ever are left on that long.

flutas 2 days ago

Ford F150 Lightning had a similar issue on a cross country road race some YT'ers put on. It died at 13% battery, Ford said it was due to not letting the truck rest.

shadowgovt 3 days ago

This is remarkably business-as-usual for airplane electronics.

As a more mundane example: the wifi on planes does temporary [edit: DHCP, not NAT] leases. But the system on many has expiration windows on the order of hours, possibly more than a day... Couple that with the number of passengers planes serve and busy routes can easily exhaust the lease pool.

The solution: there's a button the flight attendants can push to reboot the router, dumping the lease table.
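
To see how quickly this can bite, here is a toy model. Every number in it is an assumption for illustration (a /24 pool, 100 devices per flight, back-to-back 2-hour flights, 24-hour leases), and the function names are hypothetical. It is not how any real in-flight system is configured.

```python
import ipaddress


def usable_hosts(cidr: str) -> int:
    """Addresses in the subnet, minus the network and broadcast addresses."""
    return ipaddress.ip_network(cidr).num_addresses - 2


def first_flight_without_addresses(pool_size, devices_per_flight,
                                   flight_hours, lease_hours, max_flights=100):
    """Return the 1-based flight on which the pool runs dry, or None."""
    leases = []  # expiry times (in hours) of outstanding leases
    for flight in range(1, max_flights + 1):
        now = (flight - 1) * flight_hours
        leases = [expiry for expiry in leases if expiry > now]  # drop expired
        if len(leases) + devices_per_flight > pool_size:
            return flight
        leases.extend([now + lease_hours] * devices_per_flight)
    return None


if __name__ == "__main__":
    pool = usable_hosts("192.168.1.0/24")
    print("pool size:", pool)
    print("24 h leases run out on flight:",
          first_flight_without_addresses(pool, 100, 2, 24))
    print("2 h leases run out on flight:",
          first_flight_without_addresses(pool, 100, 2, 2))
```

In this model the fix is either a lease time no longer than a flight or a pool large enough for a full day of passengers; rebooting the router is the third option.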

JosephRedfern 3 days ago

Nitpicking here, but you mean DHCP rather than NAT, right?
- shadowgovt 3 days ago
  
  Yes; thank you.
Matheus28 3 days ago

Even with super long leases, couldn’t they just have a larger subnet? A /8 oughta do it.
But I guess we’re talking about the same people who made the mistake in the first place…
- jmholla 3 days ago
  
  To steelman the choice, the reserved /8 subnet is 10.x.x.x, which is often used for corporate networks, and other large subnets see similar usage. People on the plane using WiFi are likely to access their corporate networks via VPN, potentially causing routing issues.
  Users VPNing into the reused address space for their own home VPN are probably knowledgeable enough to figure out what is going on and a small enough user base to not care about.
  - ordersofmag 3 days ago
    
    I'm no network guy so someone please explain why using 10.x.x.x on a plane might "potentially cause routing issues"? It doesn't jibe with what I understand about unrouteable address spaces. Is the 10.x.x.x space somehow different than the 192.168.x.x space that millions of people use VPNs out of every day (basically every WFH person on their cheap NAT'd home Wifi)?
    
    globular-toast 2 days ago
    
    Because IPv4 sucks! If you don't have enough publicly routable addresses then you are forced to use reserved ranges like 10/8. That means you'll get collisions, i.e. multiple networks using the same addresses. With IPv6 you'd just get a real public IP address and all would be fine.
    Edit: I feel bad for saying IPv4 sucks. It's one of my favourite pieces of tech and an astonishingly good one at that. It just doesn't have a big enough address space.
    
    jmholla a day ago
    
    Hopefully I'm not too late to the party. When you set up a VPN, you are telling your network stack that all connections for a set of IP addresses will be handled by it; in this example case, all 10.x.x.x requests will be routed through the VPN's application. The VPN will then wrap up all requests through that connection and send them out to the Internet towards the public IP address on the other end of the VPN. To send things out to the Internet, you use your default gateway, basically an IP address everything is sent to when it doesn't match any other configured route (see `ip route`). If your local network is using the 10.x.x.x subnet for local connections, it will likely be 10.0.0.1 or something. But who handles that route? Your VPN, which would then just recursively keep handling its own request.
    Now, I think VPN applications are smarter than that and will still get the outgoing packet to the default gateway (citation and research needed), but what happens when it doesn't know to handle a route automagically? For instance, with DHCP, a router can tell your computer what DNS server to reach out to. If that's on the local network, now you see all DNS requests actually routing into the network on the other side where you almost certainly aren't going to be talking to a DNS server. And now, you can't go to any websites.
    Hopefully this helps. I'm not the most knowledgeable about VPNs and routing, but I'm pretty sure this is all fairly accurate.
    
    klausa 2 days ago
    
    Because many of the VPNs have _their_ internal routing using 10.0.0.0/8.
    If the plane network uses 10.0.0.0/8; and then the VPN you're trying to connect to uses 10.0.0.0/8, stuff breaks.
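
The collision is easy to detect with Python's standard `ipaddress` module. A minimal sketch follows. The specific subnets are made-up examples, and `conflicts` is a hypothetical helper, not part of any VPN client.

```python
import ipaddress


def conflicts(local_cidr: str, vpn_cidr: str) -> bool:
    """True if the local network and the VPN's internal range share addresses."""
    local = ipaddress.ip_network(local_cidr)
    vpn = ipaddress.ip_network(vpn_cidr)
    return local.overlaps(vpn)


if __name__ == "__main__":
    # Plane hands out 10.0.0.0/8, corporate VPN routes 10.20.0.0/16.
    print("plane on 10/8, VPN on 10.20/16:", conflicts("10.0.0.0/8", "10.20.0.0/16"))
    # Plane hands out a 192.168.x.x /24 instead.
    print("plane on 192.168.1/24, VPN on 10/8:", conflicts("192.168.1.0/24", "10.0.0.0/8"))
```

When the ranges overlap, an address such as 10.20.0.5 could mean a seatmate's laptop or a corporate server, and the routing table has to pick one.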
  - Filligree 3 days ago
    
    Couldn't we spare a single extra /8 for airplanes to use?
    Though I suppose it's not worth it when you can hit 'reboot'.
    
    AStonesThrow 3 days ago
    
    How about "we" use IPv6 instead, and nobody runs out of address space ever again?
    
    yard2010 2 days ago
    
    Wouldn't this mean we have to power cycle the whole internet at the same time? Instead of supporting both which means no IPv6 in the near future
    
    mrguyorama 2 days ago
    
    I would vote for a once a year internet holiday. It would bring minor mental wellbeing improvements, coerce important industries and systems to exercise redundancy pathways, provide opportunities to have such a cutover like switching to IPv6, and remind a million petty tyrant product managers that no, our goddamned fart app does not need 6 9s of reliability.
    100 bucks says IPv6 would still not get implemented. We need legislation at this point. There's enough stubborn assholes in the networking infrastructure industry just refusing to do their job for it to happen by itself. They will insist they need to save a few thousand bucks and hold the whole damn world back.
    
    shadowgovt 2 days ago
    
    > just refusing to do their job
    Their job is to make traffic work on the chunk of the Internet they administer. If they can do that with IPv4, they're doing the job.
    If there were things unreachable by protocols other than IPv6 that people needed, that could force the issue, but there aren't.
    
    AStonesThrow 2 days ago
    
    Interestingly, there was some controversy in middle schools whereby mischievous students would deliberately disable their device's Internet access in order to play a built-in browser game, and this was seen as undesirable, so I believe that the agreed mitigation was to disable the game entirely. :-(

rich_sasha 3 days ago

Scary as it is, is there any reason for a passenger jet to have uptime of more than, say, 24hrs? Wouldn't you just switch it off and on again between every flight, regardless?

If this issue was in a car, we would never know as no one keeps their car running for 50 days straight.

ceejayoz 3 days ago

Overnight, planes tend to be plugged in to ground power, to ventilate, keep the batteries charged, for the cleaning crews, etc. Most get rebooted once in a while, but it's always possible one won't be, hence the directive to be certain.
This particular problem has been known for years (the article is from 2020).
- n_ary 3 days ago
  
  Unfortunately, an aircraft has no “reboot”. It is just a violent power cut. A lot of headache is introduced in non-critical aircraft software because there is no “graceful shutdown” or long power duration. In fact, certain hardware has an upper limit (much lower than a week) before which it needs one power cut (sometimes called a power cycle), or it suffers from various buffer overflows and counter overflows and starts acting mysteriously.
  - jcgrillo 3 days ago
    
    It's amazing that's legal. Like, why do we accept software that does this? It can be done in such a way that these things don't happen. Put another way, why aren't the companies involved being fined and sued out of business? Why aren't their managers facing criminal negligence charges? It's outrageous.
    
    Veserv 3 days ago
    
    Because there has never been a single commercial jetliner fatality caused by software in its intended operational domain failing to operate according to specification. That makes the commercial jetliner software development and deployment process by far the safest and highest reliability ever conceived by multiple orders of magnitude. We are talking in the 10-12 9s range.
    And just to get ahead of: “Well what about the 737 MAX”, that was a system specification error, not due to “buggy” software failing to conform to its specification. The software did what it was supposed to do, but it should not have been designed to do that given the characteristics of the plane and the safety process around its usage.
    
    shiroiushi 2 days ago
    
    >“Well what about the 737 MAX”, that was a system specification error, not due to “buggy” software failing to conform to its specification. The software did what it was supposed to do
    Exactly: the system was designed to fly the plane into the ground if a single sensor was iced up, and that's exactly what the software did. Boeing really thought this system specification was a good idea.
    
    Veserv 2 days ago
    
    That is a massive over-simplification and that invites patently false characterizations like it was a "stupid mistake" that would have been fixed if they were not stupid (i.e. adopted average development process). That is absolutely not the case. They were really capable, but aerospace problems are really, really hard, and their safety capability regressed from being really, really capable.
    They modified the flight characteristics of the system. They tuned the control scheme to provide the "same" outputs as the old system. However, the tuning relied on a sensor that was not previously safety-critical. As the sensor was not previously safety-critical, it was not subject to safety-critical requirements like having at least two redundant copies as would normally be required. They failed to identify that the sensor became safety critical and should thus be subject to such requirements. They sold configurations with redundant copies, which were purchased by most high-end airlines, but they failed to make it mandatory due to their oversight and purchasers decided to cheap out on sensors since they were characterized as non-safety-critical even if they were useful and valuable. The manual, which pilots actually read, has instructions on how to disable the automatic tuning and enable redundant control systems and such procedures were correctly deployed at least once if not multiple times to avert crashes in premier airlines. Only a combination of all of those failures simultaneously caused fatalities to occur at a rate nearly comparable to driving the same distance, how horrifying!
    An error in UX tuning dependent on a sensor that was not made properly redundant was the "cause". That is not a "stupid mistake". That is a really hard mistake and downplaying it like it was a stupid mistake underestimates the challenges involved designing these systems. That does not excuse their mistake as they used to do better, much better, like 1,000x better, and we know how to do better and the better way is empirically economical. But, it does the entire debacle a disservice to claim it was just "being stupid". It was not, it was only qualifying for the Olympics when they needed to get the gold medal.
    
    mass_and_energy 2 days ago
    
    I really don't think it takes a mastermind of software design to go "okay I've built a system that takes control of the plane's maneuverability, let's make sure we have redundant sensors on this". Furthermore, descriptions of MCAS and its role were dangerously underplayed so that they didn't have to tell their customers to retrain their pilots. An egregious breach of public trust in a company we put a whole lot of faith into.
    
    salawat 2 days ago
    
    >They failed to identify that the sensor became safety critical and should thus be subject to such requirements.
    Whistleblower testimony indicated it wasn't a failure to identify it as safety critical, but a conscious decision not to mention it as such to the regulator, and not implement it as a dual sensor system, as doing so would have caused the design to require Class D simulator training, the absence of which Boeing was relying on as a selling point to prevent existing airlines from defecting to Airbus.
    >They sold configurations with redundant copies, which were purchased by most high-end airlines, but they failed to make it mandatory due to their oversight and purchasers decided to cheap out on sensors since they were characterized as non-safety-critical even if they were useful and valuable.
    Incorrect. All MAX's have two AoA vanes, each paired to a single Flight Computer. The plane has two Flight Computers, one on each side of the cockpit, and the computer in command is typically alternated between each flight. One computer per flight will be considered in-command (henceforth referred to as Main), the other will be henceforth referred to as operating as "auxiliary". The configuration you're thinking of is an AoA disagree light, implemented by enabling a codepath in software running on the Main FC whereby a cross-check of the value from the AoA vane networked to the auxiliary FC would light up a warning light to inform pilots that system automation would be impacted, because the AoA values between the MFC and AFC differed. A pilot would be expected to recognize this and adapt behavior accordingly/take measures to troubleshoot their instruments. Importantly, however, this feature had zero influence on MCAS. MCAS only took into account inputs from the vane directly wired to the Main FC. While a cross-check happened elsewhere for the sole purpose of illuminating a diagnostic lamp, there was no cross-check functionality implemented within the scope of the MCAS subsystem. The MCAS system was not thoroughly documented in any documentation delivered to pilots. The program test pilot got specific dispensation to leave that out of the flight manual. See the Congressional investigation, final NTSB, and FAA report.
    >The manual, which pilots actually read, has instructions on how to disable the automatic tuning and enable redundant control systems and such procedures were correctly deployed at least once if not multiple times to avert crashes in premier airlines.
    The documentation, which included an Airworthiness Directive and NOTAM, informed pilots any malfunction should be treated in the same manner as a stabilizer trim runaway. Said problem is characterized in aviation parlance as a continual uncommanded actuation of trim motors. MCAS, notably, is not that. It is periodic, and in point of fact, it ramps up in intensity over time until over 2° of travel are commanded by the computer per actuation event, with the timer between actuations being reset to 5 seconds by use of the on yoke Stab trim switches. This was not communicated to pilots. Furthermore, there were design changes to the Stab-Trim Cutout switches between 737NG (MAX's predecessor), and MAX. In the NG, the Stab Trim cutout could isolate the FC alone, or both FC and yoke switches from the Stab Trim motor. In MAX, however, the switches were changed to never isolate the FC from the Stab trim motors, because MCAS being operational was required for being able to checkmark FAR compliance for occupant carrying aircraft. So when that cutout was used, all electrically assisted actuation of the horizontal stabilizer became unavailable. The manual trim wheel would be the only trim input, and in out-of-trim attitudes, would result in such excessive loading on the control surface that physical actuation without electronic assistance was not feasible on the timescales required to recover the plane. There was a maneuver known to assist with these conditions (when they occurred at high altitude) called "roller coastering" in which you dive further into the undesired direction to unload the control surface to render it actuable. This technique has not been in official documentation since Dino 737 (Pre-NG).
    The events you're referring to when uncommanded actuations were recovered on other flights, happened at high altitudes, and were recovered with countered electrical stab switch actuation followed by Stab trim cutout within the reset 5 second watchdog timer prior to MCAS activation subsequent to a Stab-trim yoke control switch actuation. This procedure, and the implementation details needed to fully understand its significance, were undocumented prior to the two crashes. Furthermore, this procedure to cut out MCAS/the MFC from the stab trim motor and finishing the flight in a completely manually trim controlled configuration meant that technically you were flying an aircraft in a configuration that could not be certified to carry passengers when taking the FAR's prescriptively, and uncompromisingly rules-as-written with zero slack offered for convenience, because MCAS was necessary for grandfathering the MAX under the old type cert, and without MCAS functional, it's technically a new beast, which is non-compliant with control stick force feedback curves when approaching stalls, which by the way, just to make it clear, a compliant curve has been a characteristic of every civil transport in all jurisdictions worldwide for well over 50 years. This was not documented and only became apparent after investigation. Again, see the House findings, FAA report, and NTSB.
    >Only a combination of all of those failures simultaneously caused fatalities to occur at a rate nearly comparable to driving the same distance, how horrifying!
    Oh, the multi-billion dollar aircraft maker built a machine that crashes itself, gaslit its regulators, pilots, airlines, and the flying public to juice the stock price so executives could meet their quarterly incentives, and diverted funds away from its QA and R&D functions to do stock buybacks, move HQ away from the factory floor, and try to union bust. With over 300 direct measurable deaths within a couple of months and multiple years worth of grounding and mandated redesigns to fix all the other cut corners we've been unearthing, and veritable billions of dollars of loss incurred in delays. Heavens, it could happen to anybody. How could you possibly see this as something to get upset about? /s
    
    Veserv 2 days ago
    
    Thank you for providing a more thorough and complete technical explanation.
    As you can see from my final statement, I made no argument that it was not a travesty. It was ABSOLUTELY UNACCEPTABLE. This is not a defense of their inadequacy.
    I was pointing out how it is absolutely incorrect to claim that it was a "stupid mistake". That argument is used by people implicitly arguing that "If only Boeing used modern software development practices like Microsoft/Google/Crowdstrike/[insert big software company here] then they would have never introduced such problems". That is asinine. As can be seen from your explanation, the problem is multi-faceted requiring numerous design failures in both implementation, integration, and incentives. In fact, the problems are even more subtle and pernicious than in my original explanation that was derived from high level summaries rather than the investigation reports themselves.
    I do not know if this has changed in the last few years, but at Microsoft you were required to have 1 whole randomly-selected person, with no required domain expertise, say they gave your code, in isolation, a spot check before it could be added. This is the same process applied regardless of code criticality, as they do not even have a process to classify code by criticality. This is viewed as an extraordinary level of process and quality control that most could only dream of achieving. Truly if only Boeing threw out whatever they were doing and adopted such heavyweight process by "best-in-class" software development houses they would have discovered and fixed the 737 MAX problems.
    Boeing does not need to adopt modern software development "best practices" and whatever crap they use at Microsoft/[insert big software company here] that introduces bugs faster than ant queens. The processes in play that created the 737 MAX already make Microsoft and its peers look like children eating glue, but they are inadequate for the job of making safe aerospace software and systems. What Boeing needs to do is re-adopt their old practices that make the 737 MAX development processes look like a child eating glue. The 737 MAX was not stupid, it was inadequate. BOTH ARE UNACCEPTABLE, but the fix is different.
    
    jcgrillo 3 days ago
    
    So what should we make of these issues described in the article? When, not if, this kind of thing kills people will it be a specification error? Will we blame it on maintenance? Surely it can't be the software's fault!
    
    Veserv 3 days ago
    
    First of all, who got blamed for the 737 MAX? Boeing did. This is one of the few industries where the responsibility does not get easily sloughed off.
    Second, 787s have been flying for ~13 years and ~4.5 million flights [1]. Assuming they were unaware of the problem for the majority of that time, their unknowing maintenance and usage processes avoided critical failures due to the stated problems for a tremendous number of flights. Given they now know about it and are issuing a directive to enhance their processes to explicitly handle the problem, we can assume it is even less likely to occur than previously which was already experimentally determined to be ludicrously unlikely. Suing someone into oblivion for an error that has never manifested as a serious failure and that is exceedingly unlikely to manifest is a little excessive.
    Third, they should be remediating problems as they arise balanced against the risks introduced by specification changes and against the alternative of other process modifications. Given Boeing’s other recent failings, they should be given strict scrutiny that they are faithfully following the traditional, highly effective remediation processes. It should only be worrisome if they are seeing disproportionately more problems than would be expected in an aircraft design of its age and are not remediating problems robustly and promptly.
    [1] https://www.boeing.com/commercial/787#overview
    
    jcgrillo 3 days ago
    
    > Suing someone into oblivion for an error that has never manifested as a serious failure and that is exceedingly unlikely to manifest is a little excessive.
    I appreciate your point of view. The air travel industry is undeniably safe, moreso than any transportation system ever. By a large margin. On the other hand, it is possible to make software systems that do not have the defects described in the article. So how do we get to the place where we choose to build systems that behave correctly? I don't think we get there without severe penalties for failure.
    
    shiroiushi 2 days ago
    
    >The air travel industry is undeniably safe, moreso than any transportation system ever.
    I disagree: the Japanese shinkansen bullet train system has never had a fatal accident, except for a single incident 30 years ago when someone was caught in a door and dragged 100 meters. No fatalities from collisions, derailments, etc., ever, since the 1960s. That's far safer than air travel could ever claim to be.
    Even other train systems have better records than commercial aviation, in general. Plane crashes are rare these days, but they still happen once in a while, and the results are usually catastrophic.
    Are planes safer than cars? Well of course, but that's a really, really low bar: cars are driven by all kinds of morons who frequently (esp. in the US) have little to no training or testing, are frequently distracted, don't have a copilot who can take over at any time, and are frequently operating in a very, very chaotic environment (like city streets). It's truly a wonder there aren't more fatal crashes. But safer than trains in general? I seriously doubt it.
    
    Veserv 2 days ago
    
    Actually, the Shinkansen seems to average ~100 billion passenger-km per year [1] or ~60 billion passenger-miles per year. Using that as an overestimate for the last 60 years, that is a grand total of 3.6 trillion passenger-miles.
    US commercial aviation averages ~1 trillion passenger-miles per year [2]. So if we compare the last 4 years of US aviation that is a comparable number of passenger-miles.
    Over the last 4 years recorded on this dataset (2019-2022)[3] it looks like there were 5 fatalities total. Over the last 4 years recorded on this dataset (2018-2021)[4] it looks like there were 2 fatalities total.
    So, while it does not appear to be safer, it is within a few factors on a passenger-mile basis. Furthermore, there are multiple periods of 4 trillion consecutive passenger-miles where there were 0 recorded accidents. It is nowhere near obvious that it is “far safer than air travel could ever claim to be” and certainly a much closer race than you believed given your other assertions.
    [1] https://www.statista.com/statistics/1262752/japan-jr-high-sp...
    [2] https://www.transtats.bts.gov/traffic/
    [3] https://www.bts.gov/content/us-air-carrier-safety-data
    [4] https://www.airlines.org/dataset/safety-record-of-u-s-air-ca...
    
    shiroiushi 2 days ago
    
    That's not exactly a fair comparison, because you're comparing distances traveled, rather than trips taken. Of course planes are going to look good, since they travel much longer distances than cars or trains, and because planes are more likely to have trouble when taking off or landing than any time in-between. It's not like you can just take a commercial airliner flight to go to your local grocery store, even though statistically you're more likely to get killed on that trip than on a cross-continent flight.
    
    Veserv 2 days ago
    
    First of all, passenger-distance per event (or its inverse) is the standard metric used when comparing transportation safety. You would be hard-pressed to find any broad, rigorous comparison that does not compare on that metric. It encodes the risk of a trip to a location of a certain distance. It is absolutely a fair comparison.
    Second of all, even if we do use your metric which only cares about passenger-trips per event it still does not matter. The Shinkansen has transported ~6.4 billion people since inception. As seen in the second link I provided above, US commercial aviation serves ~900 million passengers per year. So, that is 7 years of US commercial aviation to transport the same number of people the Shinkansen has ever transported. As seen on the third link the last 7 years (2016-2022) had ~6 fatalities and as seen on the fourth link the last 7 years (2015-2021) had 2 fatalities compared to the 1 fatality on the Shinkansen.
    Third of all, given that the Shinkansen has transported ~6.4 billion people, but averages 150 million people per year and ~60 billion passenger-miles per year, we can reasonably conclude that I overestimated at ~3.6 trillion passenger-miles and it would likely actually be ~2.4 trillion passenger miles or just 2.5 years of US aviation. From the third link that would be a mere 1 fatality and from the fourth link 0-1 fatalities.
    If we extend our analysis to the last decade the third link indicates 15 fatalities over ~10 trillion passenger miles, ~2x the Shinkansen rate, and the fourth link indicates 2 fatalities over ~10 trillion passenger miles, ~50% the Shinkansen rate. Again, broadly comparable, but it is hard to truly tell which one is "safer" than the other. And again, they are clearly in the same ballpark and not dramatically different as you implied.
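
A rough sketch of the rate arithmetic being argued over, for anyone who wants to plug in their own numbers. The inputs are only the approximate figures quoted in the comments above (not independently checked), and the helper names are made up for this example.

```python
KM_PER_MILE = 1.609344


def km_to_miles(km: float) -> float:
    """Convert a distance (or passenger-distance) from km to miles."""
    return km / KM_PER_MILE


def fatalities_per_trillion_pm(fatalities: int, passenger_miles: float) -> float:
    """Fatalities per trillion passenger-miles."""
    return fatalities / (passenger_miles / 1e12)


# Figures as quoted in the thread; treat them as assumptions.
shinkansen_pm_per_year = km_to_miles(100e9)          # ~100 billion passenger-km/year
shinkansen_upper_bound = shinkansen_pm_per_year * 60  # "overestimate" over 60 years
shinkansen_revised = 2.4e12                           # commenter's revised lifetime total

print(f"Shinkansen per year: {shinkansen_pm_per_year / 1e9:.0f} billion passenger-miles")
print(f"Shinkansen 60-year upper bound: {shinkansen_upper_bound / 1e12:.1f} trillion")
print(f"Shinkansen, 1 fatality: {fatalities_per_trillion_pm(1, shinkansen_revised):.2f} per trillion")
print(f"US aviation, 15 over 10T: {fatalities_per_trillion_pm(15, 10e12):.2f} per trillion")
print(f"US aviation, 2 over 10T: {fatalities_per_trillion_pm(2, 10e12):.2f} per trillion")
```

With counts this small, one additional event moves the rate a lot, so the comparison is only good to within a small factor either way.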
    
    andrewf 2 days ago
    
    > So how do we get to the place where we choose to build systems that behave correctly? I don't think we get there without severe penalties for failure.
    What failure? The planes work. This is puritanism.
    
    nullstyle 2 days ago
    
    Elevators would like a word with you.
    https://nationalelevatorindustry.org/elevators-escalators-ar...
    
    ceejayoz 2 days ago
    
    Their deaths-per-passenger-mile stats are worse, though.
    US airlines haven't had a single fatal crash in 15 years.
    https://nypost.com/2019/08/22/video-shows-moment-man-crushed...
    
    lostlogin 2 days ago
    
    > First of all, who got blamed for the 737 MAX? Boeing did. This is one of the few industries where the responsibility does not get easily sloughed off.
    The whistleblowers dying is coincidental and convenient.
    https://www.theguardian.com/business/article/2024/may/02/sec...
    
    gruez 2 days ago
    
    1. For at least one of the whistleblowers, it was certainly not "convenient" because he already managed to go public with the accusation, the lawsuit was filed, and his deposition was already made.
    2. I'm not sure how a few whistleblowers dying disproves "responsibility does not get easily sloughed off". If anything, they're getting more responsibility than is warranted. Every time there's something wrong with a Boeing product, people almost reflexively start posting about how it must be caused by corner cutting by Boeing, or how it's yet more evidence that Boeing is circling the drain. This happens even for incidents involving planes that are decades old and have a solid service history, and that by all accounts were probably caused by pilot error or improper maintenance.
    
    ceejayoz 3 days ago
    
    Because it works fine. A maintenance tech gets one extra line item on the weekly or monthly inspection checklist.
    
    jcgrillo 3 days ago
    
    It works fine until it doesn't and people die. At which point the blame falls on the maintenance crew? That's wrong. And where there's smoke there's fire. If the software has this horrible bug, likely the broken culture that created it has written worse, more subtle bugs.
    
    ceejayoz 3 days ago
    
    Commercial air travel in the US is incredibly safe. The last fatal crash was in 2009.
    
    mjewkes 2 days ago
    
    I agree completely with the first part. But SWA-1380 was a commercial operating fatality in 2018. Not a crash into terrain, but the engine definitely crashed into the fuselage.
    
    class700 2 days ago
    
    Probably not much comfort for the passenger who was ejected from the plane and died...
    
    Log_out_ 2 days ago
    
    Because changes to that software go through an enormous amount of testing, validation, and documentation for a new baseline to become a flashable item. Meanwhile, an always-working workaround is needed now.
    
    salawat 2 days ago
    
    Have you even found the documentation around things like ACPI? It's kinda coupled with UEFI these days I think, and hell, I'm not even sure of the hardware boards/revisions aircraft makers are using these days... Are they still on BIOS? Or old-as-sin linux/RTOS kernels/microcontrollers?
    Point being, when you start talking about high QA systems, where the Quality is non-negotiable (you will have everything documented and tested); barring exec/managerial malfeasance in preventing that work from being done, you reach for the same simple things over and over again since it takes a hell of a lot of work to actually characterize and certify a thing to the requisite level of reliability/operating conditions.
    Testing ain't free, ya know.
  - ceejayoz 3 days ago
    
    > Unfortunately, an aircraft has no “reboot”. It is just a violent power cut.
    That’s a reboot.
    
    mulmen 2 days ago
    
    There’s nothing about a reboot that precludes a graceful shutdown.
    
    gruez 2 days ago
    
    There's also no reason why a "reboot" can't be a "violent power cut", especially if the equipment in question doesn't hold any state. For instance, there's no reason why you'd need to go through a shutdown sequence for a printer.
    
    theviat 2 days ago
    
    Please tell my printer that. It becomes _very_ grumpy if it loses power instead of being shut down via its off button.
    And then it turns itself off if it's not used for a while. I hate printers.
    
    mulmen 2 days ago
    
    A reboot is just a boot after a shutdown. It doesn’t matter what kind of shutdown that is.
  - th42o34234234 2 days ago
    
    This has to be a joke, right?
    You're telling me Aerospace's "real engineering-level" is worse than something a sophomore can cook up?
    
    morcheeba 2 days ago
    
    The testing for aerospace is extremely rigorous ... For DO-178C level A (Catastrophic failure that can cause a crash or many fatal injuries) we're estimating 2 years to do MC/DC test coverage metric of a fairly basic software system that has two mechanical backups. And that's above and beyond the extensive unit tests.
    The main thing that gets checked is the worst-case timing analysis for every branch condition. And there are stack monitors to monitor if the stack is growing in size.
    Look at Rapita Systems' website for more info ... we don't use them, but they explain it well.
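
For anyone unfamiliar with MC/DC (modified condition/decision coverage): the core requirement is that the tests show each condition in a decision independently changing the outcome. A toy sketch of what that means follows. The decision function is hypothetical, and the brute-force search is only for illustration; real certification tooling works on instrumented target code, not like this.

```python
from itertools import product


def decision(a: bool, b: bool, c: bool) -> bool:
    # Hypothetical guard condition, chosen only for illustration.
    return a and (b or c)


def independence_pairs(fn, n_conditions: int) -> dict:
    """For each condition index, list pairs of input vectors that differ only
    in that condition and produce different decision outcomes."""
    pairs = {i: [] for i in range(n_conditions)}
    for vec in product([False, True], repeat=n_conditions):
        for i in range(n_conditions):
            if vec[i]:
                continue  # visit each pair once, starting from its False side
            flipped = vec[:i] + (True,) + vec[i + 1:]
            if fn(*vec) != fn(*flipped):
                pairs[i].append((vec, flipped))
    return pairs


pairs = independence_pairs(decision, 3)
for index, found in pairs.items():
    print(f"condition {index}: {len(found)} independence pair(s)")
```

A test set meets the independence requirement when it contains at least one such pair per condition, which is why the test count grows with every condition added to a branch.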
    
    reportgunner 2 days ago
    
    Wait till you hear about boeing in space.
  - kulahan 3 days ago
    
    >an aircraft has no “reboot”. It is just a violent power cut
    Guess how I typically reboot things :)
    
    thfuran 3 days ago
    
    By traveling to Mexico and laying out bait along the migratory path of the butterflies?
sitkack 3 days ago

Many cars' control units continue to run while the car is off. If you want to reboot your vehicle, you need to unplug the 12v battery for at least a minute.
- jcgrillo 3 days ago
  
  On some cars (recent VWs in particular) when you plug the battery back in you need to twiddle some settings in the computer otherwise the charging circuit will fry the battery prematurely. We've gotten ahead of our skis with this nonsense, time to rein it in.
  - symisc_devel 3 days ago
    
    This issue is notorious for BMW cars. You have to notify the ECU each time you install a new battery.
    
    jcgrillo 3 days ago
    
    It's hard to imagine an interpretation of this behavior that doesn't involve manufacturers trying to punish independent mechanics and end users who service their own cars. Like, there's no way it's an "honest mistake", right?
    BTW I have an AGM ("advanced glass mat") battery in my 1995 Toyota which has a completely analog charging system, and it doesn't get cooked, so it's not because there's something special about the battery.
    
    HeyLaughingBoy 3 days ago
    
    Don't attribute to malice what can easily be explained by overstressed Systems Engineers trying to resolve multiple conflicting Requirements.
    
    jcgrillo 3 days ago
    
    My point is there was absolutely no need for the System Engineers to touch the charging system. The normal analog diode rectifier variety that has been standard since the 1960s is Good Enough. No "Innovation" Needed. Take your spacecamp nerds elsewhere.
    
    4gotunameagain 2 days ago
    
    Sure, you MUST know better than the BMW engineers who designed the feature we have zero information about.
    
    whatevaa 2 days ago
    
    Engineers often do stuff without any thought of maintenance. Just ask mechanics/maintenance personnel.
    
    4gotunameagain 2 days ago
    
    Having performed repairs on a BMW motorcycle, I am quite aware. It is a good point, but I highly doubt that it would play a role in this case. There must be something there that we are missing.
    
    Log_out_ 2 days ago
    
    at this point anything is possible: they barely write the specs
    https://www.heise.de/en/news/BMW-Huge-recall-and-profit-warning-due-to-defective-Conti-brakes-9864793.html
    
    tdullien 2 days ago
    
    "Somebody needed to get promoted"
    
    chiph 2 days ago
    
    That's because BMW ECUs adapt to the lower voltage as the battery ages and instruct the alternator/charger to provide more current. Replace the battery and the ECU would cause it to be overcharged unless you notify it of the replacement. Yes it's an over-engineered system, but ... German car.
    
    reportgunner 2 days ago
    
    Sounds like an afterbender straightener architecture.
    
    dzhiurgis 3 days ago
    
    Ahhh, "program a new battery" $400 please.
  - RichardHesketh 3 days ago
    
    Rein. It’s about controlling a horse, not an entire nation.
    
    jcgrillo 3 days ago
    
    Thanks, I blame phone autocorrect
    
    AStonesThrow 3 days ago
    
    It's always champing at the bit
themoonisachees 3 days ago

Some of these planes are constantly flying as long as they're not in maintenance. A plane not in the air is a plane the company bought that's not currently generating profit.
fnordpiglet 3 days ago

I’ll bet you the typical EV stays powered on 24/7 with reboots around OTA updates.
- garyfirestorm 3 days ago
  
  Unsure what you mean here. Most of the systems go to a sleep state in modern vehicles, EV or not. The 12V battery keeps only certain ECUs up (think ECUs that control the alarm, lock and unlock state, and any communication with the mobile app via LTE), but the rest of the systems are OFF. You don't want an EV battery to hit 0% and the 12V to also hit 0%; that would basically make it a brick from what I understand, because EVs have contactors which need to shut for the battery to be 'engaged', and the 12V battery controls these contactors.
  - fnordpiglet 3 days ago
    
    A car with an enormous rack of high capacity batteries able to accelerate an 8000 pound object to 60mph and sustain that for hundreds of miles generally doesn’t depend on the backup battery for literally anything. It has so much excess energy storage in the form of electricity in the primary batteries it generally doesn’t power down the onboard computers at all.
    Indeed when you get close to exhausting the main battery rack it starts selectively shutting down everything. I’ve never personally let mine get to 0% ever - but for instance a Tesla is continuously on, and if you use sentry mode it’s not just on but the GPU is constantly doing classification of the environment to determine if someone is prowling your vehicle.
    
    thatfrenchguy 2 days ago
    
    Every EV depends on the 12v battery for starting up / has the HV battery off when your car is off, that's why if the 12v battery is dead your car won't start.
  - Kirby64 3 days ago
    
    Low voltage battery death in any EV essentially causes a brick. The only exception is some cars (I think Tesla does this?) keep their contactors closed all the time when the 12v is determined to be failing. It makes the drain at idle much higher, but then at least it can continue moving… as long as you don’t let the HV pack drain…
n_ary 3 days ago

Very strange, because for me, an aircraft (medium) is never alive for more than 24h. A big one like the 787 may be alive for up to 72h (assuming longer routes). 50 days would be a dream for me and a lot less headache, but it is very expensive to keep an aircraft powered that long with ground power.
- rogerrogerr 2 days ago
  
  > it is very expensive to keep an aircraft powered that long with ground power.
  Why do you say this?
potato3732842 3 days ago

I know someone on the north slope of Alaska. He does not turn his personal truck off all winter. This is even more typical for semi trucks and whatnot around there.
yard2010 2 days ago

I think it's about the worst case scenario. You wouldn't want this to happen even rarely, especially when it can be solved by putting more time (and god forbid, money) into R&D.
sheepybloke 3 days ago

Airlines will run the aircraft as long as possible. As another commenter mentioned, if an aircraft isn't in flight, it's in maintenance. It's powered on that whole time.
rodgerd 3 days ago

It's another thing on a checklist that can go wrong.

tomudding 3 days ago

(2020)

jmrm 2 days ago

51 days? That looks like the old Windows 95/98 bug, where a 32-bit variable was used to store uptime in milliseconds
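
For reference, here is a rough sketch of why a 32-bit millisecond counter runs out just short of 50 days, and why code that subtracts two readings can misbehave across the rollover. It assumes a 1 ms tick and an unsigned 32-bit counter; the function names are made up for illustration, and nothing here is confirmed to be how the 787 (or Windows) actually stores uptime.

```python
# Sketch of a 32-bit millisecond uptime counter (assumed tick: 1 ms).
# All names are hypothetical and only illustrate the rollover arithmetic.
COUNTER_BITS = 32
WRAP = 2 ** COUNTER_BITS            # counter values live in [0, WRAP)
MS_PER_DAY = 24 * 60 * 60 * 1000


def days_until_wrap(tick_ms: float = 1.0) -> float:
    """Days of uptime before the counter rolls over to zero."""
    return WRAP * tick_ms / MS_PER_DAY


def tick_count(uptime_ms: int) -> int:
    """Value a 32-bit counter would report after uptime_ms milliseconds."""
    return uptime_ms % WRAP


def elapsed_naive(start: int, now: int) -> int:
    """Plain subtraction: goes negative once `now` has wrapped past zero."""
    return now - start


def elapsed_modular(start: int, now: int) -> int:
    """Unsigned 32-bit style subtraction: correct across a single wrap."""
    return (now - start) % WRAP


if __name__ == "__main__":
    print(f"wrap after {days_until_wrap():.2f} days")
    start = tick_count(WRAP - 500)    # reading taken 500 ms before rollover
    now = tick_count(WRAP + 1500)     # reading taken 1500 ms after rollover
    print("naive elapsed:  ", elapsed_naive(start, now))
    print("modular elapsed:", elapsed_modular(start, now))
```

The naive difference comes out hugely negative while the modular one gives the expected 2000 ms, which is the kind of discrepancy that could surface as stale or misleading values downstream.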

sgarland 2 days ago

There was a similar problem with a specific generation of 688-class submarines, where a calculated temperature would slowly drift. The metric wasn’t used for any protective actions, so it wasn’t a “shut down immediately and return home surfaced on the diesel” situation, but still disconcerting.

I assume that after this the software was soak-tested for weeks / months to eliminate that class of bug. Naval Reactors is many things, but repeating the same mistake twice isn’t one of them.

avelis 3 days ago

In the software world I call this an end user discovered issue. But when the issue involves a plane that is carrying actual souls, that can feel very scary.

I am sure this has been resolved by now since it's from 2020.

recursive 3 days ago

I don't think airplane software ships updates the way npm packages do. I would be more surprised if this is fixed.
- advisedwang 3 days ago
  
  I think from the point of view of Boeing, the FAA and the airlines, "put it in our maintenance checklist to reboot every 51 days" is a fix.
  - woah 3 days ago
    
    With that framing, this sounds like one of the easiest maintenance tasks imaginable. No wrenches or grease involved.
- thecosmicfrog 3 days ago
  
  > I don't think airplane software ships updates the way npm packages do.
  I'd ideally like to sleep tonight, thanks.
- trollied 2 days ago
  
  They do get software updates. Watch "Stig Aviation" "Stig Shift" series on youtube. He's shown how to do updates in a few of his videos.
AmVess 3 days ago

Scary would be right.
Reminds me of the F-22 Raptor International Date Line error in 2007. They were flying a squadron of them from Hawaii to Japan. They crossed the IDL and all nav/fuel systems went down, as well as some communications gear.
They only made it back because they were flying with tankers at the time, who led them back to base.
- extraduder_ire 2 days ago
  
  Was that a coordinate thing, a timezone thing, or something else?
  I'm assuming the former.
Dylan16807 3 days ago

That depends on how much code was having trouble, and what you mean by "resolved".
The safe option might be to avoid the situation, and I could imagine that even if there is a code update it might just make the plane balk at getting ready to take off after a certain amount of uptime.

justmarc 2 days ago

There have just been too many worrying signs from Boeing in the last few years.

I have no idea about these things at all but some of the issues seem almost unforgivable to me.

They should work very hard so that the industry and the ultimate end users regain confidence in them. I'm not sure they are doing this.

qxfys 2 days ago

It sounds like my random Raspberry Pi sitting somewhere in my server room that has to be restarted every <x> weeks.

olabyne 2 days ago

Really? Mine has an uptime of a year or so; it resets only if a big storm cuts the mains power for a few seconds. Maybe it's the new hardware? I have the original one (armv6, 512 MB of RAM)
louwhopley 2 days ago

Same same but different
tonyedgecombe 2 days ago

My TV needs regular reboots, about every six weeks.

joejohnson 3 days ago

This was news in 2020. Has it been fixed?

bandyaboot 2 days ago

If problems persist after rebooting, you may need to use a giant paperclip to perform a reset.

pulse7 2 days ago

"Reboot tut immer gut!" (Reboot is always good!)

DiggyJohnson 2 days ago

I'm honestly impressed that the Register included a prominent blurb explaining to the reader that while this sounds like a catastrophic issue, the most likely outcome if this is experienced in flight is a safe and controlled landing.

> Sidenote
>
> Pitch and power is a simple concept. If you have the throttles, say, three-quarters open and the nose of the aeroplane is pointing a few degrees above the horizon, chances are you're probably flying straight and level at a safe speed. Training manuals normally contain a number of precise pitch and power settings (they vary between aeroplane types) so if display systems start failing, pilots can fall back to these with confidence.

tedunangst 3 days ago

And 4.5 years later, what's new?

akira2501 3 days ago

> This alarming-sounding situation

That's not what's alarming to me. What's alarming is that the plane could possibly be in a position to be continuously powered on for 51 days in the first place.

stavros 3 days ago

When a minute of downtime costs thousands, why wouldn't you expect planes to be in constant utilization?
- akira2501 3 days ago
  
  > why wouldn't you expect planes to be in constant utilization?
  They require weekly maintenance which takes them out of service for at least 12 hours.
  What we may think of as 'constant utilization' is quite different in a regulated fleet environment like airlines.
  - hinkley 3 days ago
    
    maintenance would happen with the aircraft in 'wheels on ground' mode but that may not mean all systems are turned off. I expect it's like a bug in the SMC on a computer. To really turn it off you have to do some magic.
  - stavros 3 days ago
    
    "Constant utilization" means "they aren't sitting idle", not "they aren't undergoing necessary maintenance ever".
- fallingknife 3 days ago
  
  The number of flights varies a lot by time of day, so there is nothing close to constant utilization.
  - Filligree 3 days ago
    
    There's not much reason to turn them off outside of maintenance. When they're parked, they're connected to grid power.
    
    thecosmicfrog 3 days ago
    
    Airliners are regularly and routinely shut down. "Cold and dark" is a common startup procedure for the first flight of the day.
    
    n_ary 3 days ago
    
    A parked aircraft is not kept powered unless maintenance or other routines (cleaning/checks/certification/preparation/restocking etc.) are under way.
    It is very surprising how many comments here claim the contrary.
    Even when parked for the next flight, it is not powered until resupply and cargo routines are declared.
  - CactusOnFire 3 days ago
    
    I've flown with airlines before where there was a cascading delay due to a "plane deficit" at the terminal (not the technical term, that's my own). Not to say it's always uptime, but I imagine there are instances of constant uptime.
    
    fallingknife 3 days ago
    
    They can't just change things up on a dime like that. Even if it's 3 AM and most planes are sitting on the ground they can't just be used for your flight like that because they are all scheduled to take off in the morning rush a few hours later.

tessierashpool9 2 days ago

easy:

  while (true) {
    // reset once uptime reaches 51 days, but only on the ground
    if (
      this.sys.uptimeDays >= 51
        && !this.sys.isFlying
    ) {
      this.sys.resetNow();
    }
    time.sleep(1000);
  }

boohoo123 2 days ago

well now your system doesn't do anything because it's stuck in a forever loop checking the time. It's most likely programmed in C so you can remove the OOP as well.
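
A rough sketch of the non-blocking alternative: make the uptime check one cheap step inside a task the scheduler already runs periodically, so nothing sits in a loop. It is written in Python for brevity rather than C, and every name in it (`SystemState`, `needs_reset`, `periodic_task`, the action strings) is hypothetical, as is the 51-day threshold borrowed from the headline.

```python
# Hypothetical sketch: the uptime check as one step of a periodic task,
# rather than a loop that owns the thread. All names are illustrative.
from dataclasses import dataclass

RESET_AFTER_DAYS = 51  # threshold borrowed from the headline figure


@dataclass
class SystemState:
    uptime_days: float
    is_flying: bool


def needs_reset(state: SystemState) -> bool:
    """True when the uptime limit is reached and the aircraft is on the ground."""
    return state.uptime_days >= RESET_AFTER_DAYS and not state.is_flying


def periodic_task(state: SystemState) -> list[str]:
    """One scheduler slot: do the normal work, then the cheap uptime check."""
    actions = ["normal_work"]
    if needs_reset(state):
        actions.append("request_reset")
    return actions


if __name__ == "__main__":
    print(periodic_task(SystemState(uptime_days=12.0, is_flying=False)))
    print(periodic_task(SystemState(uptime_days=51.5, is_flying=True)))
    print(periodic_task(SystemState(uptime_days=51.5, is_flying=False)))
```

Only the last call asks for a reset; the in-flight case keeps doing normal work, and the function returns immediately either way.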

jcelerier 3 days ago

51 days * 86400 seconds * 1000

=> 4406400000

2^32

=> 4294967296

the coincidence seems unlikely, it's basically ~~5 hours and a half~~ 30 hours of difference if one has a 1-ms counter increment

sitkack 3 days ago

Watch Windows 95 crash live as it exceeds 49.7 days uptime https://news.ycombinator.com/item?id=28340101
Must be a northwest washington thing.
Dylan16807 3 days ago

It's a day and a half difference, and since 2^32 is the smaller number that would be pretty catastrophic. Pretty likely it's coincidence.
thamer 3 days ago

Where did you get 5 hours and a half? It seems to be closer to 31 hours:
```
    >>> round((4406400000 - 2**32)/(1000 * 3600), 3)
    30.954
```
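
Another way to read that gap: if a 32-bit tick counter really were behind the 51-day figure, each tick would have to last a little longer than 1 ms. The sketch below works that out under exactly that assumption; the helper names are made up and none of this is confirmed about the 787's software.

```python
# Sketch: what tick period would make a 32-bit counter last a given
# number of days? Assumes one counter increment per tick; names are
# hypothetical.
MS_PER_DAY = 86_400_000
COUNTER_RANGE = 2 ** 32


def implied_tick_ms(days: float) -> float:
    """Tick period (ms) at which COUNTER_RANGE ticks span `days` days."""
    return days * MS_PER_DAY / COUNTER_RANGE


def wrap_days(tick_ms: float) -> float:
    """Days until a 32-bit counter wraps at the given tick period."""
    return COUNTER_RANGE * tick_ms / MS_PER_DAY


if __name__ == "__main__":
    tick = implied_tick_ms(51)
    print(f"{tick:.4f} ms per tick, roughly {1000 / tick:.1f} Hz")
    print(f"{wrap_days(1.0):.2f} days at exactly 1 ms per tick")
```

So a tick about 2.6% slower than 1 kHz would be enough to stretch the familiar 49.7 days to 51, though that is only one possible explanation.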
- jcelerier 3 days ago
  
  from me typing too quickly in bc, apparently :')
throwbadubadu 3 days ago

Not getting it.. yeah the famous 32 bit ms overflow after 49 something days. But why then 51 here? Shouldn't they be required to reboot after 49 days please please? :D
- tines 3 days ago
  
  Possibly cumulative error in the timing source?
  - hinkley 3 days ago
    
    It's possible to run tasks so that, instead of starting every second, each one starts one second after the previous iteration finishes.
    So if you have something that checks the system health every millisecond, and keeps a count instead of a duration, then if it takes a couple microseconds to complete you might get something less than 86 million ticks per day instead of 86.4 million.
    
    Jtsummers 3 days ago
    
    The OS used on the 787 has a hard real-time scheduler. Tasks are started up at a specific frequency (set per task), run to completion or to the end of their time slot (set per task) and terminated. We had, IIRC, a strict 100ms slot for our bit of LRU software to do everything and it would be launched every 1s (from memory, that was 15 years ago). Information could be stored between executions so partial completion is something you could handle if needed by storing state information and using it at the start of the next iteration (we didn't need that, our tasks finished in the slot).
    You don't base the start of a future task on the end of the prior one, you base it on a fixed clock for these kinds of systems.
    
    tedunangst 3 days ago
    
    Or maybe it's aliens and their strontium-89 wormhole collapses after 51 days. At this point we're just making shit up.
  - jcelerier 3 days ago
    
    Or just ticking every 1.025 ms (e.g. at ~975 Hz instead of 1 kHz)... that brings us to:
    (4406400000 - 1.025 * 2^32) / 1000 ≈ 4058 seconds
    so a difference of about 1.13 hours from the "51 days" mention.
  - icelancer 3 days ago
    
    This is even scarier than the base concern.
- amelius 3 days ago
  
  Maybe it takes 2 days to boot the entire thing?

dgoldstein0 2 days ago

This should carry a label: 2020. This article is 4.5 years old

dang 2 days ago

Added. Thanks!

boohoo123 3 days ago

this is what happens when you hire based on checked checkboxes and not qualifications.

xyst 3 days ago

This company just can’t stay out of the news. Their planes are trash. Software is straight garbage. Many people have died because of this company and suffered undue stress/anxiety because of the massive dip in quality.

Boeing engineers/builders caught on audio stating they wouldn’t be caught dead in their own planes unless feeling suicidal.

zamadatix 3 days ago

The company definitely can't stay out of the news and it's gone downhill over the recent years but you've picked an interesting post to lament about those on. The news they can't stay out of is over 4 years old in this case. The model of plane it's about (787) has never had a single fatality despite >15 years of operations and >1,000 units operating today. In all, deaths are probably the worst possible metric to berate Boeing on - including every death (e.g. hijackings, not just engineering failures) their popular 747 line has had comes to <6,000 fatalities despite carrying billions of passengers over a period of >50 years.
Despite their ever increasing incompetence on delivery speed, test compliance, and innovation... commercial air travel with Boeing (and other major air manufacturers) has always been one of, if not the, safest mechanisms of travel we've ever executed on. Particularly the last 5 years have been the safest period in terms of air travel deaths or injuries.
None of that means we shouldn't criticize Boeing by any means, just that doing it over perceived death and accident counts because of what news headlines imply is complete nonsense in terms of actual numbers no matter how you slice it. It's important those kinds of things are reported but it's equally important to not get swept up in paranoia over it.
- gs17 2 days ago
  
  Agreed, my 737 fears were relieved by researching how many of them are in the air at any moment, how many millions of trips they fly each year, how old airframes can get before they get retired, etc. Even the "worse" models are feats of engineering.