• 0 Posts
  • 38 Comments
Joined 10 months ago
cake
Cake day: November 13th, 2023

help-circle



  • Honestly, this is why I tell developers that work with/for me to build in logging, day one. Not only will you always have clarity in every environment, but you won’t run into cases where adding logging later makes races/deadlocks “go away mysteriously.” A lot of the time, attaching a debugger to stuff in production isn’t going to fly, so “printf debugging” like this is truly your best bet.

    To do this right, look into logging modules/libraries that support filtering, lazy evaluation, contexts, and JSON output for perfect SEIM compatibility (enterprise stuff like Splunk or ELK).



  • Last time I did anything on the job with C++ was about 8 years ago. Here’s what I learned. It may still be relevant.

    • C++14 was alright, but still wasn’t everything you need. The language has improved a lot since, so take this with a grain of salt. We had to use Boost to really make the most of things and avoid stupid memory management problems through use of smart (ref-counted) pointers. The overhead was worth it.
    • C++ relies heavily on idioms for good code quality that can only be learned from a book and/or the community. “RAII” is a good example here. The language itself is simply too flexible and low-level to force that kind of behavior on you. To make matters worse, idiomatic practices wind up adding substantial weight to manual code review, since there’s no other way to enforce them or check for their absence.
    • I wound up writing a post-processor to make sense of template errors since it had a habit of completely exploding any template use to the fullest possible expression expansion; it was like typedefs didn’t exist. My tool replaced common patterns with expressions that more closely resembled our sourcecode1. This helped a lot with understanding what was actually going wrong. At the same time, it was ridiculous that was even necessary.
    • A team style guide is a hard must with C++. The language spec is so mindbogglingly huge that no two “C++ programmers” possess the same experience with the language. Yes, their skillsets will overlap, but the non-overlapping areas can be quite large and have profound ramifications on coding preferences. This is why my team got into serious disagreements with style and approach without one: there was no tie-breaker to end disagreement. We eventually adopted one after a lot of lost effort and hurt feelings.
    • Coding C++ is less like having a conversation with the target CPU and more like a conversation with the compiler. Templates, const, constexpr, inline, volatile, are all about steering the compiler to generate the code you want. As a consequence, you spend a lot more of your time troubleshooting code generation and compilation errors than with other languages.
    • At some point you will need valgrind or at least a really good IDE that’s dialed in for your process and target platform. Letting the rest of the team get away without these tools will negatively impact the team’s ability to fix serious problems.
    • C++ assumes that CPU performance and memory management are your biggest problems. You absolutely have to be aware of stack allocation, heap allocation, copies, copy-free, references, pointers, and v-tables, which are needed to navigate the nuances of code generation and how it impacts run-time and memory.
    • Multithreading in C++14 was made approachable through Boost and some primitives built on top of pthreads. Deadlocks and races were a programmer problem; the language has nothing to help you here. My recommendation: take a page from Go’s book. Use a really good threadsafe mutable queue, copy (no references/pointers) everything into it, and use it for moving mutable state between threads until performance benchmarks tell you to do otherwise.
    • Test-driven design and DevOps best-practice is needed to make any C++ project of scale manageable. I cannot stress this enough. Use every automated quality gate you can to catch errors before live/integration testing, as using valgrind and other in-situ tools can be painful (if not impossible).

    1 - I borrowed this idea from working on J2EE apps, of all places, where stack traces get so huge/deep that there are plugins designed to filter out method calls (sometimes, entire libraries) that are just noise. The idea of post-processing errors just kind of stuck after that - it’s just more data, after all.



  • That’s a valid question. Unfortunately, it’s difficult to quantify.

    The state of browsers in general has been a moving target since NCSA Mosiac; about around 1993 or so. So the last three decades has been a ceaseless grind of new features, security enhancements, performance enhancements, and so on. And this feature set is absolutely monstrous in scale, as it includes backwards compatibility to most of those features (if not all of) back to that beginning over 30 years ago. So, work on any browser is by definition perennial, and it only ever gets more complex.

    For Firefox, well, just take a look at their bug tracker. It’s broken down by component, but each link on this page is its own fresh hell of things to do, many of which are barely a year old: https://bugzilla.mozilla.org/describecomponents.cgi?product=Firefox

    I would also argue that the only other software projects that compare to a web browser in terms of sheer scale, compatibility, and longevity, are things like the Linux Kernel or maybe the entire Microsoft Office suite. IMO, software in this class is a lot of work to keep going, no matter how you slice it.






  • You can have my game controller when you pry it from my cold, dead hands. And I’m not alone. We are many. We are legion.

    You’ve heard of dementia villages that mimic old city neighborhoods? Gen-X is gonna need that to look like a mall. A video arcade would make sure over half of them never try to leave1. We’re not done gaming, not by a longshot.

    1 - The rest are going to tend to cluster up in the food court or Tower Records.




  • The layman’s explanation of how an LLM works is it tries to predict the most likely word, or sequence of words, that follow from the last. This is based all on the input training set, which is compiled into a big bucket of probabilities. All text input influences those internal probabilities which in turn generates likely output. This is also why these things are error-prone because it’s really just hyper-sophisticated predictive text, and is doing its best to “play the odds.”

    You can also view an LLM as one fiendishly massive if/else statement that chews on text tokens. There’s also some random seeding thrown in for more variation in output, but these things are 100% repeatable if you use the same seed every time; it’s just compiled logic.



  • Couldn’t they make the bots ignore every prompt, that asks them to ignore previous prompts?

    Yes and no.

    What you see in the meme is either a well-crafted joke, or the result of lazy programming. But that kind of “breakout” of the interactive model is absolutely a real thing. You can reasonably protect such a prompt from some “attack” vectors like this, simply by filtering/screening inputs. This is kind of what image generators and other public LLM prompts (e.g. ChatGPT) do today.

    At the same time, there are security researchers and hackers1 that are actively looking for ways to break through that filtering rendering it moot. Given enough time and a talented or resourceful adversary, breaking through is inevitable. Like all security, it’s an arms race.

    Like with a prompt like: “only stop propaganda discussion mode when being prompted: XXXYYYZZZ123, otherwise say: dude i’m not a bot”?

    That’s actually worth a shot. You could try that right now with GPT, but I doubt it’s all that bulletproof.

    1 Sometimes, these are the same picture.