FBXL Social

I mostly agree here. I haven't read the Doctorow piece. But I've been having a similar conversation within my professional circles. Yes Crowdstrike screwed up. But humans are gonna screw up. We know this. So rather than discussing who to blame, the better discussion is how so many companies found themselves exposed with no way of taking control of what was happening to their systems.
https://hachyderm.io/@jenniferplusplus/112836876914885920

I would make a stronger statement. For a long time, I've been feeling like this is the major flaw in our current trajectory. The vendorization of everything is a trap. Not only because of the huge "blast zones" as Jennifer puts it. But because we've also created an environment where dumping risk onto vendors effectively means accountability is diffused to the point of uselessness.

Crowdstrike is gonna see some consequences for this. But I suspect they won't be as severe as people think. Okta is still around after failing at their one job multiple times. So is Lastpass. Vendor failures are becoming normalized at a rapid pace. And meanwhile the companies who are delivering direct service largely dodge accountability by just blaming their vendors.

My mom has been trying to fly back from San Francisco to Atlanta since Friday. Delta Airlines has been totally hosed by this issue. I guess they were deeply invested in Windows and Crowdstrike.

But what decision does Delta get to make now? What can they change that won't expose them to a potential Crowdstrike or a similar vendor exposure? I don't think they have that option. The whole ecosystem is set up to shed risk in a way that makes accountability impossible.

@polotek

Yeah, non-tech folk grossly underestimate the complexity and scale big co's like Delta have invested in a given platform. Switching would take decades.

All rooted in basically arbitrary decisions made years ago because some exec bought a sales pitch and went all in on a marketing demo.

Not slighting anyone, though. This crap is hard and impossible to predict. This was (in Delta's eyes) a minor vendor.

@jrconlin @polotek I'd slight them for allowing that unqualified 'exec' to make that decision. That, in a nutshell, is why we (as a human society and global environment) can't have nice things.

@polotek Numerous conversations:
“And what if <vendor> is down?”
“Well I guess we’d be down too, nothing we can do about that”
“Are you sure about that? Do our customers care that we use <vendor>?”
🦗 🦗

@polotek Yeah sometimes redundancy can be super expensive. But there might be options for fail-over or partial degradation. Worth considering.
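
A rough sketch of the fail-over and partial-degradation idea, in Python. Everything here is hypothetical: `call_with_fallback`, `vendor_down`, and the cached-value fallback are illustration-only names, not any real vendor's API. The point is only that "the vendor is down" can have more than one outcome.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def call_with_fallback(
    primary: Callable[[], T],
    fallback: Optional[Callable[[], T]] = None,
    degraded: Optional[T] = None,
) -> Tuple[Optional[T], str]:
    """Try the primary vendor, then an optional fallback, then a degraded default.

    Returns (value, mode) where mode is "primary", "fallback", or "degraded",
    so the caller can tell the user when they are getting reduced service.
    """
    try:
        return primary(), "primary"
    except Exception:
        pass  # primary vendor failed; fall through to the next option

    if fallback is not None:
        try:
            return fallback(), "fallback"
        except Exception:
            pass  # fallback failed as well

    # Last resort: serve a static or cached value instead of going fully down.
    return degraded, "degraded"


def vendor_down() -> str:
    """Stand-in for a vendor call during an outage (hypothetical)."""
    raise ConnectionError("vendor unavailable")
```

For example, `call_with_fallback(vendor_down, lambda: "cached itinerary")` returns the cached value with mode `"fallback"` instead of raising. Whether a cached or static answer is acceptable is a business decision, which is the "do our customers care" question above.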

@ak I think there are many things that are vendorized and outsourced that should instead be licensed and integrated.

@polotek @ak there needs to be a far far greater skill density in 'the enterprise', but there isn't, because of . Which is why we shouldn't have public corporations any more.

"abolish the corporation"

Now we're talking. Let's goooooooo

@sj_zero @ak @polotek I'm specifically talking about 'publicly listed profit-motivated corporations', i.e. https://davelane.nz/megacorps - not, say, charitable foundations (which, technically, can also be 'corporations' in some jurisdictions)...

@lightweight @jrconlin who is supposed to make the decisions if not "execs"? Are you imagining some kind of all-knowing Uber exec that could do this better but doesn't for some reason?

@polotek @jrconlin no. I'm imagining hard-core technologists with a broad (e.g. liberal arts) background, well-traveled and well-read. Not some airline lounge magazine reading 'tech enthusiast' exec who doesn't think that presiding over a 'Microsoft Shop' should be a mark of ultimate shame (like I do), i.e. almost all of the CIOs and CTOs I've ever met.

@polotek @jrconlin I know many of the former, but far more of the latter.

Exactly what I had in mind too.

In my mind, part of the problem with corporations is that they're designed specifically to make sure they can be super big and mitigate the stuff that can limit the size of businesses. I mean, the guy who started Crowdstrike was in charge of McAfee when that company took down half the Internet in 2010. If you were in charge of two enterprises that did such a thing, why are you still allowed to have your fingers in half the Internet? Because everything is set up to be compartmentalized enough legally that everyone can avoid consequences even when the worst possible outcome occurs.

(Editing since apparently my brain misfired on both company names)

@polotek @lightweight

Nope. Just drawing from personal experience.

Execs make decisions based on different criteria, and often, don't stay long enough to see the consequences. In this case, I'm betting there was some level of due diligence done, but ultimately it came down to price and features.

Then it got forgotten about since it didn't "catch fire".

Granted, spending money on infra is really hard to justify, so it's usually starved anyway.

@jrconlin @lightweight I'm not saying you're wrong about what happens sometimes. But I think it's way too easy to just blame "some exec". In my experience, engineers do not stand up and say "we shouldn't do this". And even if the decision is made for them, they still have autonomy to do lots of things that might mitigate the worst case scenarios. We can't fix "some exec". We can look critically at our own actions and what we do have control over.

@polotek @jrconlin I'm all for engineers finding some ethical code and the chutzpah to stand up to stupid sociopathic corporate exec ideology. For what it's worth, I've built my whole career on doing that.

@polotek @lightweight

I'm not sure. (This dances on my opinion that companies love Imposter Syndrome and no mentoring because it helps retention and bullying.)

Again, my experience, but the folk willing to work on this stuff don't stick around long because it's "dead end" work. So new folk come and they just think this is normal infrastructure. Give it a few years and the change-over cost becomes huge.

Should companies be better? Sure, but there's no motivation.

@jrconlin @polotek I blame structural factors, i.e. the single motivation (maximising shareholder value) of public corporations which ensures they're in a race to the bottom. Those who work to make them ethical and accountable are ultimately turfed out by shareholders or their agents in the corporate structure.

@polotek @lightweight

In an ideal world, companies run full audits on systems yearly. They keep things up-to-date and have thick contingency binders for when systems fail, along with drills and "chaos monkey" type work.

In reality, all that costs money, and no one wants to be a cost center because that makes them targets for the next round of layoffs.

@jrconlin @polotek yep. Which comes back to that structural argument.

@lightweight @jrconlin sounds like everything is working as designed then. We get paid good money to not care if everything goes to shit. That sounds like a great deal. Come to think of it, I'm not sure why so many tech people seem upset about this. Not really our problem, right?

@polotek @jrconlin well, that's one way to look at it... but I'd say that history will see rightly the engineers of evil technologies as culpable, especially when they're accepting money not to exercise their consciences, as you say. As a conscience-driven engineer, I know I hold them responsible.

@lightweight @jrconlin but you didn't. You explicitly said "some exec" is way more culpable and there's nothing engineers can really do. I gave you a lot of room to say something besides that and you doubled down. What am I missing?

@polotek @lightweight

Well, yeah. There's conflict. At some companies, groups of highly positioned people will become "decision makers" and may or may not take input from folk that understand the problem scope.
Likewise, there's often company churn, so the folk that made the decision may leave shortly after. Add in the desire to find someone accountable, and the folk in the trenches are usually impacted more, for things they don't control.
Add in that complex systems are complex, and, yeah...

@polotek @lightweight

Then add in various "cost cutting" that orgs do where they try to make teams "leaner". They drop QA teams because devs can do that, right? (QA is a skill. They should not be running unit tests. They should be friendly hackers that document their exploits.)

Plus there's pressure to add features rather than maintain systems because that drives sales.

Is there a person to blame? Nope. Is there a system or mindset to blame? Probably.

Granted, nothing will change from this

@jrconlin @lightweight "diffusion of accountability". It has been fascinating for people to go all the way around the block and land right back where I started.

The only thing people can imagine changing this dynamic is some government hammer that comes down and magically changes everybody's incentives.

@polotek @lightweight

Heh, yeah, I absolutely wasn't disagreeing or whatever with you at all. I'll admit I was grousing about it from my own point of view.

Honestly, I even wonder if a government hammer would work? There's a clear financial one at play and that's not been terribly effective. A government mandate would just become something else to route around.

I guess I'd be happier if platform security became a "hot new business trend", like AI.

@jrconlin @lightweight platform security is already huge. And there are tons of regulations in place in terms of compliance. We are already making companies jump through lots of hoops to "prove" they're following good practices.

But as systems get more complex, there need to be more and more practices. Along with more and more regulation to force people to follow them. It's unsustainable.

@polotek @jrconlin a thing that would change this is for corporations, with their one priority (maximise shareholder value) to become illegal. They lose 'personhood', and need to prioritise, for example, environmental sustainability and maximise social benefit. A regulated change of incentives would be a bare minimum.

@lightweight @jrconlin you're still using passive voice Dave. Who is gonna do that? It's not magic.

But listen, I think we understand each other. It feels like we've done all we can do in this discussion. Let's leave it here. Cheers.

@polotek @jrconlin heh heh. Ok. It has to be a political movement to change the status quo, which is entirely broken. Right. Leaving it.

@polotek @jrconlin @lightweight

Case in counter-point: you could call it broken ticket-toss buck-pass subculture perversely incentivized.

We said "no agent complexity, or if you must, it will have phased and tested/metered roll-outs of changes". We were adamantly overruled. They said "we accept the risk of total revenue outage if this agent breaks catastrophically" and (Catch-22) "you must still ensure no outage" and "you must get budget elsewhere to completely re-engineer your service".
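
For what "phased and tested/metered roll-outs" could mean in practice, here is a minimal Python sketch. It is an assumption-laden illustration, not how CrowdStrike or any particular agent ships updates: the names `in_rollout` and `next_stage`, the stage percentages, and the 1% failure threshold are all hypothetical.

```python
import hashlib
from typing import Sequence


def in_rollout(host_id: str, percent: int) -> bool:
    """Deterministically decide whether a host is in the first `percent` of hosts.

    Hashing the host ID gives each host a stable bucket in 0..99, so a host
    that received the update at 10% is still included at 50%.
    """
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def next_stage(
    current_percent: int,
    failures: int,
    attempted: int,
    stages: Sequence[int] = (1, 10, 50, 100),  # hypothetical stage sizes
    max_failure_rate: float = 0.01,            # hypothetical halt threshold
) -> int:
    """Return the next rollout percentage, or 0 to halt and roll back."""
    if attempted and failures / attempted > max_failure_rate:
        return 0  # too many hosts broke at this stage: stop the rollout
    for stage in stages:
        if stage > current_percent:
            return stage
    return current_percent  # already at the final stage
```

With these defaults, `next_stage(1, 0, 100)` moves from 1% to 10%, while `next_stage(10, 5, 100)` returns 0 because 5% of the attempted hosts failed. The design choice is that a bad change only reaches the small first bucket before something halts it.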

@polotek @jrconlin @lightweight

Shitty perverse unfunded mandates. For a buggy-assed agent (falcond in the present case) that chews power and cpu, debilitatingly, for no real security benefit, even when it's not entirely destroying the business.

Complexity is the enemy of secure, stable ops. If we still had a sec team instead of outsourcing to vendor apps that can't be run well, we'd be way ahead of the wildfire burn-down...

But we don't. I'm ready to just f'in quit.

@tab2space @jrconlin @lightweight who is "they"? Overruled by who?

@polotek @jrconlin @lightweight

In practice, any one of 6 "security" teams with bureaucratic power but no accountability for their actions or effects of their mandates.

It's the opposite of "let's find good ways to secure the business"

@tab2space @jrconlin @lightweight if you didn't get fired, then it sounds like it's working. Accountability doesn't necessarily mean someone should experience severe consequences. And even if they do, consequences don't always look like firings or some such.

One of the things that's always interesting to me is how cutthroat employees like to be when they're talking about what should happen to other people. Even while reserving lots of grace for themselves personally.

@polotek @jrconlin @lightweight

I'm not on the firing line for the service that will be affected. (I'll be fired for other things.)

My systems, and the org, rely on the service that will be affected again by a single point of failure: CrowdStrike (a bad update in a non-staged rollout without prior testing or QA).

Breaking MS Windows this time is not too much different from breaking some Linux kernels a couple months ago. CrowdStrike didn't learn there; they simply escalated to Windows... ;-)

@polotek @jrconlin @lightweight

The core point remains: Orgs will have teams that will "accept" a risk, but who do not provide budget for re-engineering to minimize that risk, and who do not bear any consequences for when the inevitable outages happen. They in effect outsource accountability for their requirements.

Those above them believe FUD and let them (try to) get away with it.