remram 17 hours ago

Rust has the "non_exhaustive" attribute that lets you declare that an enum might get more variants in the future. In practice that means that when you match on such an enum, you have to add a default case. It's like an "Other" variant in the enum, except you can't reference it directly; you handle it via the default case.

IIRC a secret 'other' variant (or '__non_exhaustive' or something) is actually how we did things before non_exhaustive was introduced.

  • kibwen 16 hours ago

    Note that the stance of the OP here is broadly in agreement with what Rust does. His main objection is this:

    > The word “other” means “not mentioned elsewhere”, so the presence of an Other logically implies that the enumeration is exhaustive.

    In Rust, because all enums are exhaustive by default and exhaustive matching is enforced by the compiler, there is no risk of this sort of confusion. And as for his proposed solution:

    > Just document that the enumeration is open-ended

    The non_exhaustive attribute is effectively compiler-enforced documentation; users now cannot forget to treat the enum as open-ended.

    Of course, adding non_exhaustive to Rust was not without its own detractors; its usage for any given enum fundamentally means shifting power away from library consumers (who lose the ability to guarantee exhaustive matching) and towards library authors (who gain the ability to evolve their API without causing guaranteed compilation errors in all of their users (which some users desire!)). As such, the guidance is that it should be used sparingly, mostly for things like error types. But that's an argument against open-ended enums in general, not against the mechanisms we use to achieve them (which, as you say, were already possible in Rust via hacks).

    • tyre 16 hours ago

      Maybe there should be a compiler option or function to assert that a match is exhaustive. If the match does not handle a defined case, it blows up.

      • aecsocket 16 hours ago

        Rust already asserts that a match is exhaustive at compile time - if you don't include a branch for each option, it will fail to compile. This extends to integer range matching and string matching as well.

        It's just that with #[non_exhaustive], you must specify a default branch (`_ => { .. }`), even if you've already explicitly matched on all the values. The idea being that you've written code which matches on all the values which exist right now, but the library author is free to add new variants without breaking your code - since it's now your responsibility as a user of the library to handle the default case.
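
        A minimal sketch (with a hypothetical `Direction` enum) of what this looks like from a downstream crate:

          // In the library crate:
          #[non_exhaustive]
          pub enum Direction { North, South, East, West }

          // In a downstream crate (assuming `dir: Direction`): every current
          // variant is listed, yet the compiler still demands a wildcard arm
          // because of #[non_exhaustive].
          let label = match dir {
              Direction::North => "N",
              Direction::South => "S",
              Direction::East => "E",
              Direction::West => "W",
              _ => "?",
          };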

        • ffminus 12 hours ago

          Library users can force a compile error when new variants get added, using a lint from rustc. It's "allow" by default, so it's opt-in.

          https://doc.rust-lang.org/rustc/lints/listing/allowed-by-def...

          • WiSaGaN 9 hours ago

            Does this require nightly? If so, #[warn(clippy::wildcard_enum_match_arm)] will do the same thing with no need for nightly, via clippy instead of rustc natively.

          • codetrotter 9 hours ago

            That's pretty neat. I still don't completely understand why #[non_exhaustive] is so desirable in the first place though.

            Let's say I am using a crate called zoo-bar. Let's say this crate is not using non-exhaustive.

            In my code where I use this crate I do:

              let my_workplace = zoo_bar::ZooBar::new();
              
              let mut animal_pens_iter = my_workplace.hungry_animals.iter();
              
              while let Some(ap) = animal_pens_iter.next() {
                  match ap {
                      zoo_bar::AnimalPen::Tigers => {
                          me.go_feed_tigers(&mut raw_meat_that_tigers_like_stock).await?;
                      }
                      zoo_bar::AnimalPen::Elephants => {
                          me.go_feed_elephants(&mut peanut_stock).await?;
                      }
                  }
              }
            
            I update or upgrade the zoo-bar dependency and there's a new variant of the AnimalPen enum called Monkeys.

            Great! I get a compile error and I update my code to feed the monkeys.

              diff --git a/src/main.rs b/src/main.rs
              index 202c10c..425d649 100644
              --- a/src/main.rs
              +++ b/src/main.rs
              @@ -10,5 +10,8 @@
                         zoo_bar::AnimalPen::Elephants => {
                             me.go_feed_elephants(&mut peanut_stock).await?;
                         }
              +          zoo_bar::AnimalPen::Monkeys => {
              +              me.go_feed_monkeys(&mut banana_stock).await?;
              +          }
                     }
                 }
            
            
            Now let's say instead that the AnimalPen enum was marked non-exhaustive.

            So I'm forced to have a default match arm. In this alternate universe I start off with:

              let my_workplace = zoo_bar::ZooBar::new();
            
              let mut animal_pens_iter = my_workplace.hungry_animals.iter();
            
              while let Some(ap) = animal_pens_iter.next() {
                match ap {
                  zoo_bar::AnimalPen::Tigers => {
                    me.go_feed_tigers(&mut raw_meat_that_tigers_like_stock).await?;
                  }
                  zoo_bar::AnimalPen::Elephants => {
                    me.go_feed_elephants(&mut peanut_stock).await?;
                  }
                  _ => {
                    eprintln!("Whoops! I sure hope someone notices this default match in the logs and goes and updates the code.");
                  }
                }
              }
            
            When the monkeys are added, and I update or upgrade the dependency on zoo-bar, I don't notice the warning in the logs right away after we deploy to prod, because the logs contain so many things that no one can read everything.

            One week passes and then we have a monkey starving incident at work.

            After careful review we realize that it was due to the default match arm and we forgot to update our program.

            So we learn from the terrible catastrophe with the monkeys and I update my code using the attributes from your link.

              diff --git a/src/main.rs b/src/main.rs
              index e01fcd1..aab0112 100644
              --- a/wp/src/main.rs
              +++ b/wp/src/main.rs
              @@ -1,3 +1,5 @@
              +#![feature(non_exhaustive_omitted_patterns_lint)]
              +
               use std::error::Error;
               
               #[tokio::main]
              @@ -11,6 +13,7 @@ async fn main() -> anyhow::Result<()> {
                 let mut animal_pens_iter = my_workplace.hungry_animals.iter();
               
                 while let Some(ap) = animal_pens_iter.next() {
              +    #[warn(non_exhaustive_omitted_patterns)]
                   match ap {
                     zoo_bar::AnimalPen::Tigers => {
                       me.go_feed_tigers(&mut raw_meat_that_tigers_like_stock).await?;
              @@ -18,8 +21,12 @@ async fn main() -> anyhow::Result<()> {
                     zoo_bar::AnimalPen::Elephants => {
                       me.go_feed_elephants(&mut peanut_stock).await?;
                     }
              +      zoo_bar::AnimalPen::Monkeys => {
              +        // Our monkeys died before we started using proper attributes. If they are hungry it means they have turned into zombies :O
              +        me.alert_authorities_about_potential_outbreak_of_zombie_monkeys().await?;
              +      }
                     _ => {
              -        eprintln!("Whoops! I sure hope someone notices this default match in the logs and goes and updates the code.");
              +        unreachable!("We have an attribute that is supposed to tell us if there were any unmatched new variants.");
                     }
                   }
                 }
            
            And next time we update or upgrade the crate version to latest, another new variant exists, but thanks to your tip we get a lint warning and we happily update our code so that we won't have more starving animals.

              diff --git a/wp/src/main.rs b/wp/src/main.rs
              index aab0112..4fc4041 100644
              --- a/wp/src/main.rs
              +++ b/wp/src/main.rs
              @@ -25,6 +25,9 @@ async fn main() -> anyhow::Result<()> {
                       // Our monkeys died before we started using proper attributes. If they are hungry it means they have turned into zombies :O
                       me.alert_authorities_about_potential_outbreak_of_zombie_monkeys().await?;
                     }
              +      zoo_bar::AnimalPen::Capybaras => {
              +        me.go_feed_capybaras(&mut whatever_the_heck_capybaras_eat_stock).await?;
              +      }
                     _ => {
                       unreachable!("We have an attribute that is supposed to tell us if there were any unmatched new variants.");
                     }
            
            But what was the advantage of marking the enum as #[non_exhaustive] in the first place?
            • kelnos 7 hours ago

              Consider a bit of a different case. I run a service that exposes an API, and some fields in some response bodies are enums. I've published a Rust client for the API for my customers to use, and (among other things) it has something like this:

                  #[derive(serde::Serialize, serde::Deserialize)]
                  pub enum SomeEnum {
                      AValue,
                      BValue,
                  }
              
              My customers use that and all is well. But I want to add a new enum value, CValue. I can't require that all my customers update their version of my Rust client before I add it; that would be unreasonable.

              So I add it, and what happens? Well, now whenever my customers make that API call, instead of getting some API object back, they get a deserialization error, because that enum's Deserialize impl doesn't know how to handle "CValue". Maybe some customer wasn't even using that field in the returned API object, but now I've broken their code.

              Adding #[non_exhaustive] means I at least won't break my customers' code when I add a new enum value.
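
              #[non_exhaustive] takes care of the compile-time breakage; the wire-format side still needs a catch-all when decoding. A minimal hand-rolled sketch (the `Unrecognized` variant and this impl are illustrative, not the actual client code):

                  use serde::Deserialize;

                  #[derive(Debug)]
                  #[non_exhaustive]
                  pub enum SomeEnum {
                      AValue,
                      BValue,
                      Unrecognized(String), // hypothetical catch-all for unknown wire values
                  }

                  impl<'de> Deserialize<'de> for SomeEnum {
                      fn deserialize<D: serde::Deserializer<'de>>(d: D) -> Result<Self, D::Error> {
                          // Decode via a plain string so a new value like "CValue"
                          // doesn't fail the whole response.
                          let s = String::deserialize(d)?;
                          Ok(match s.as_str() {
                              "AValue" => SomeEnum::AValue,
                              "BValue" => SomeEnum::BValue,
                              _ => SomeEnum::Unrecognized(s),
                          })
                      }
                  }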

            • ffminus 8 hours ago

              It lets you have a middle ground, with the decision of when breaking happens left up to library users. Without non_exhaustive, all consumers always get your second scenario. With non_exhaustive, individual zoos get to pick their own policy of when/if animals should starve.

              Each option has its place, it depends on context. Does the creator of the type want/need strictness from all their consumers, or can this call be left up to each consumer to make? The lint puts strictness back on the table as an opt-in for individual users.

        • tialaramex 10 hours ago

          Importantly #[non_exhaustive] applies to your users but not you. In the defining crate we can write exhaustive matches and those work - the rationale is that we defined this type, so we should know how to do this properly. Our users however must assume they don't know if it has been extended in a newer version.

          #[non_exhaustive] is most popular on the enum itself, but it is also permitted on published struct types (it means we promise these published fields will exist, but maybe we will add more, and thus change the size of the structure overall) and on individual enum variants (it means the inner details of that variant may change: you can still pattern-match it, but we might add more fields and your matches must cope).
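
          A rough sketch of those placements (hypothetical types):

            #[non_exhaustive]
            pub struct Config {     // we may add fields: downstream code cannot
                pub verbose: bool,  // use struct literals, and patterns need `..`
            }

            pub enum Event {
                Key(char),
                #[non_exhaustive]
                Resize { width: u16, height: u16 }, // this variant may gain fields
            }

            // Downstream, matching the non_exhaustive variant must leave room
            // for new fields:
            //   Event::Resize { width, height, .. } => ...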

          • 3836293648 8 hours ago

            Wait, what. I thought it existed for FFI purposes, whether that's with C or network protocols. The defining crate getting away with it undermines this.

            • tialaramex an hour ago

              No. If you mean "I don't know", that's not non_exhaustive; that's "I don't know".

              For a network protocol or C FFI you probably want a primitive integer type, not any of Rust's fancier types such as enum, because while you might believe this byte should have one of six values, 0x01 through 0x06, maybe somebody decided the top bit is a flag now, so 0x83 is "the same" as 0x03 but with a flag set.

              Trying to unsafely transmute arbitrary blobs of data into a Rust type is likely to end in tears; this attribute does not fix that.

  • kpcyrd an hour ago

    It's still a gotcha in Rust, I've seen code like:

      #[non_exhaustive]
      pub enum Protocol {
        Tcp,
        Udp,
        Other(u16),
      }
    
    It allows you to still match on the unrecognized case (like `Protocol::Other(1)`, which is nice), but an additional enum variant may eliminate that case, if our enum gets extended to:

      #[non_exhaustive]
      pub enum Protocol {
        Tcp,
        Udp,
        Icmp,
        Other(u16),
      }
    
    Even though we can add additional variants in a semver-nonbreaking way due to `#[non_exhaustive]`, other people's code may now be silently broken until they've changed `Protocol::Other(1)` to `Protocol::Icmp`.

    Having had this in the back of my head for quite some time, I think instead of an `Other` case there should be two methods: one returns an `Option<Protocol>`, and the other returns the `u16` representation. Unless there's a match on one of your expected cases, your default branch would inspect the raw numeric type, which keeps working even if that case is later added to the enum.
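
    A rough sketch of that design (hypothetical names, and assuming the `Other` case is dropped from `Protocol`):

      pub struct RawProtocol(pub u16);

      impl RawProtocol {
          /// The recognized variant, if any. Adding `Icmp` later shrinks the
          /// `None` space but cannot silently bypass existing match arms.
          pub fn protocol(&self) -> Option<Protocol> {
              match self.0 {
                  6 => Some(Protocol::Tcp),  // IANA protocol numbers
                  17 => Some(Protocol::Udp),
                  _ => None,
              }
          }

          /// The raw wire value, always available as a fallback.
          pub fn raw(&self) -> u16 {
              self.0
          }
      }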

  • sunshowers 16 hours ago

    There is currently a missing middle ground in stable Rust, which is to lint on a missing variant rather than fail compilation. There's an unstable option for it, but it would be very useful for non-exhaustive enums where consumers care about matching against every known variant.

    You can practically use it today by gating on a nightly-only cfg flag. See https://github.com/guppy-rs/guppy/blob/fa61210b67bea233de52c... and https://github.com/guppy-rs/guppy/blob/fa61210b67bea233de52c...

    • eru 13 hours ago

      Couldn't clippy do that for you?

      • sunshowers 13 hours ago

        Not at the moment. The unstable lint is implemented in rustc directly, not in clippy, though I guess it could move to clippy in the future.

  • rendaw 10 hours ago

    I absolutely _hate_ this. Since you're forced to add a default case, if a new variant that you need to actively handle is added in the future, it won't turn into a compile error _or_ surface as a runtime error.

    I think half of it is developers presuming to know users' needs and making decisions for them (users can make that decision by themselves, using the default case!) but also a logic-defying fear of build breakage, to the point that I've seen developers turn other compile errors into runtime errors in order to avoid "breaking changes".

    • bobbylarrybobby 9 hours ago

      I agree, this is the one place where upstream crates should be allowed to make breaking changes for downstream users. As a consumer of another crate’s enum, it's easy enough to opt into “never break my code” by just adding default cases, but I'd like to have to opt into that so that I'm notified when new variants are added upstream. Maybe this should even be a Cargo.toml setting — when an upstream crate is marked non-exhaustive, the downstream consumer gets to choose: require me to add default cases (and don't mark them as dead code), or let me exhaustively match anyway, knowing my match statement might break in the future.

    • michaeljsmith 10 hours ago

      Not sure about Rust, but TypeScript lets you keep the default handling but still flag a compile error when a new value is added (the first is useful e.g. if a separate component is updated and starts sending new values).

      https://stackoverflow.com/a/39419171/974188

  • hchja 16 hours ago

    This is why language syntax is so important.

    Swift allows a ‘default’ case in switch statements, which is similar to Other, but you should use it with caution.

    It’s better to not use it unless you’re 110% sure that there will not be additional enum cases added in the future.

    Otherwise, in Swift, when you add an additional enum case, the code that uses the enum will not compile until you handle the new case at each respective call site.

    • layer8 15 hours ago

      The better solution is to have two different “default” cases in the language, one that expresses handling “future” values (values that aren’t currently defined), and one that expresses “the rest of the currently defined values”. The “future” case wouldn’t be considered for exhaustiveness checks.

      • mayoff 14 hours ago

        Swift allows an enum to be marked `@frozen`, which is an API (and ABI) stability guarantee that the enum will never gain more cases. Apple uses this quite sparingly in their APIs.

        Swift also has two versions of a `default` case in switch statements, like you described. It has regular `default` and it has `@unknown default`. The `@unknown default` case is specifically for use with non-frozen enums, and gives a warning if you haven't handled all known cases.

        So with `@unknown default`, the compiler tells you if you haven't been exhaustive (vs. the current API), but doesn't complain that your `@unknown default` case is unreachable.

        • layer8 14 hours ago

          Ah, thanks, I wasn’t aware of these two “default” variants in Swift.

      • SkiFire13 15 hours ago

        What would the "future" default case actually do though? When you're in the past there's no value for it, and the moment you get to the future the values will become part of the "present" and will still not fall under the "future" case. You would need some kind of versioning support in the enum itself, but that's a much bigger change.

        • layer8 14 hours ago

          “Future” values only become defined (“present” in your sense) at compile-time, but may occur before that at runtime. Note that this mostly presumes a language with separate compilation, or situations like coding against a remote-API spec, where the server may deploy a newer version but your client remains unchanged. Once you compile against the new spec, you’d get errors/warnings about the new, not explicitly handled values, but your existing binary would nevertheless handle those values under the “future” case.

          The issue with traditional “default” cases is that they shadow warnings/errors about unhandled cases, but you’d still want to have some form of default case for forward compatibility.

          • eru 13 hours ago

            > “Future” values only become defined (“present” in your sense) at compile-time, but may occur before that at runtime. Note that this mostly presumes a language with separate compilation, [...]

            Separate compilation is a technical implementation detail that shouldn't have an impact on semantics. Especially since LTO (link time optimisation) is becoming more and more common; 'thin' LTO is essentially free in Rust at least in terms of extra build time. LTO blurs the lines between separate compilation units.

            On the flip side, Rust can use multiple codegen units even for the same crate, thus introducing separate compilation where a naive approach, like in classic C, would only use a single one.

            • layer8 13 hours ago

              Separate compilation is relevant, because it means the version of the interface you compile against may not be the same version you run against. This is fine if the newer version is compatible with the older version. And for the present discussion, we consider an added enum value to not constitute a compatibility break. Nevertheless, it means that the client code can now receive a value that it couldn’t receive before. And it’s useful to be able to define a case distinction for such unknown future values, while at the same time having the compiler check that all currently defined values have been duly considered.

              In other words, you want to ensure that you have the most appropriate behavior for whatever values are currently known, and a fallback behavior for the future values that by definition you can’t possibly know at the present time. Of course, this is more or less only practical in languages where the interface version you compile against is only updated deliberately, while the implementation version at runtime can be any newer compatible version.

kstenerud 5 hours ago

I use the "other" technique when it's necessary for the user to be able to mix in their own:

    enum WidgetFlavor
    {
        Vanilla,
        Chocolate,
        Strawberry,
        Other=10000,
    };

Now users can add their own (and are also responsible for making sure it works in all APIs):

    enum CustomWidgetFlavor
    {
        RockyRoad=Other,
        GroovyGrape,
        Cola,
    };

And now you can amend the enum without breaking the client:

    enum WidgetFlavor
    {
        Vanilla,
        Chocolate,
        Strawberry,
        Mint,
        Other=10000,
    };

  • qingcharles 4 hours ago

    It's code like this that ends in a terminal choco-banana shake hang:

    http://www.technofileonline.com/texts/chocobanana.gif

    • fingerlocks 4 hours ago

      What is the context here? Is this just a silly nonsense tech support page/meme or an actual product from the late 90s?

      • sandblast 3 hours ago

        Is the context not clear for you from the information that the article applies to "DreamWorks Interactive, Someone's in the Kitchen, version 1.0" and other clues that it's a game?

qbane 2 hours ago

How about putting Other at the top? You can convince yourself that the value zero (or one if you like) is reserved for unknown values.

  • Cthulhu_ 20 minutes ago

    That's the Go approach, where every variable is zero-valued by default, so it makes sense for the first (zero) enum value to be a 'none' or 'other' or 'unknown' value.

    (note that Go doesn't have enums as a language feature, but you can use its const declaration to create enum-like constants)

  • shakna an hour ago

    This is what I tend to do. Because 0 is "default", it means "unspecified" in a lot of my API designs.

zdw 17 hours ago

I wonder how this aligns with the protobuf best practice of having the first value be UNSPECIFIED:

https://protobuf.dev/best-practices/dos-donts/#unspecified-e...

  • bocahtie 17 hours ago

    When the deserializing half of the protobuf definitions encounters an unknown value, it gets deserialized as the zero value. When that client updates, it will then be able to deserialize the new value appropriately (in this case, "Mint"). The advice on that page also specifies to not make the value semantically meaningful, which I take to mean you should never set it to that value explicitly.

    • chen_dev 14 hours ago

      > it gets deserialized as the zero value

      It’s more complicated:

      https://protobuf.dev/programming-guides/enum/

      >> What happens when a program parses binary data that contains field 1 with the value 2?

      >- Open enums will parse the value 2 and store it directly in the field. Accessor will report the field as being set and will return something that represents 2.

      >- Closed enums will parse the value 2 and store it in the message’s unknown field set. Accessors will report the field as being unset and will return the enum’s default value.

      • vitus 9 hours ago

        Ugh. I hate how we (Google) launched proto editions.

        It used to be that we broadly had two sets of semantics (modulo additional customizations): proto2 and proto3. Proto editions was supposed to unify the two versions, but instead now we have the option to mix and match all of the quirks of each of the versions.

        And, to make matters worse, you also have language-dependent implementations that don't conform to the spec (in fact, very few implementations are conformant). C++ and Java treat everything imported by a proto2 file as closed; C#, Golang, and JS treat everything as open.

        I don't see a path forward for removing these custom deprecated field features, or else we'd have already begun that effort during the initial adoption of editions.

    • dwattttt 13 hours ago

      > The advice on that page also specifies to not make the value semantically meaningful, which I take to mean to never set it to that value explicitly.

      I've taken to coding my C enums with the first value being "Invalid", indicating it is never intended to be created. If one is encountered, it's a bug.

  • jmole 17 hours ago

    The example code added “other” as the last option, which was the source of the problems he described.

    This doesn’t happen when you make the first value in the enum unknown/unspecified

    • plorkyeran 17 hours ago

      No, the problem described in the article is entirely unrelated to where in the enum the Other option is located. There is a different problem where keeping the Other option at the end of the enum changes the value of Other, but that is not the problem that the article is about.

      • jmole 16 hours ago

        Well it simplifies the logic considerably - if you see an enum value you don’t recognize (mint), you treat it as uninitialized (0).

        So any future new flavor will be read back as ‘0’ in older versions.

  • seeknotfind 17 hours ago

    This is the same as a null pointer. The requirement is very deeply tied to protobuf because it is used in large distributed systems that always need to handle version mismatch, and this advice doesn't necessarily apply to API design in general.

    • eddd-ddde 17 hours ago

      Even in the simplest web apps you can encounter version mismatch when a client requests a response from a server that just updated.

      • seeknotfind 15 hours ago

        This implies an API where the server has a single shared implementation. Imagine for instance that the server implements a shim for each version of the interface, then there isn't a need for the null in the API. Imagine another alternative, that the same API never adds a field, but you add a new method which takes the new type. Imagine yet again an API where you are able to version the clients in lockstep. So, it's a decision about how the API is used and evolves that recommends the API encoding or having a null default. However in a different environment or with different practices, you can avoid the null. Of course the reason to avoid the null is so that you can statically enforce this value is provided in new clients, though this also assumes your client language is typed. So in the end, protobuf teaches us, but it's not always the best in every situation.

      • hansvm 16 hours ago

        Hence the advice to make that situation not happen. Update the client and server to support both versions and prefer the new one, then update both to not support the old version. With load balancers and other real-world problems you might have to break that down into 4 coordinated steps.

        • Joker_vD 15 hours ago

          That only really works if you control the clients, or can force them to update.

          • LoganDark 14 hours ago

            > or can force them to update.

            I've used a few clients that completely lock me out for every tiniest minor version update. Very top-tier annoying imho.

            • eru 13 hours ago

              But it does make the authors' jobs easier.

  • MarkMarine 7 hours ago

    I don’t mind the zero value for the proto enums, it makes sense, but I require the conversion into my internal logic to exclude this “unknown”, and to error during the conversion if it fails.

    I’ve seen engineers carry those unknowns or unspecifieds through to the business logic, and that always made my face flush red with anger.

    • fmbb 6 hours ago

      Why the anger?

      If you are consuming data from some other system you have no power over what to require from users. You will have data points with unknown properties.

      Say you are tracking sign ups in some other system, and they collect the users’ browser in the process, and you want to see conversion rate per browser. If the browser could not be identified, you prefer it to say ”other” instead of ”unknown”?

      I think I prefer the protobuf best practices way: you have a 0 ”unknown”/”unset” value, and you enumerate the rest with a unique name (and number). The enum can be expanded in the future so your code must be prepared for unknown enumerated values tagged with the new (future for your code) number. They are all unique, you just don’t yet know the name of some of the enum values.

      You can choose to not consume them until your code is updated with a more recent schema. Or you can reconcile later, annotating with the name if you need it.

      Now personally, I would not pick an enum for any set of things that is not closed when you are designing. But I’m starting to think that such sets hardly exist in the real world. Humans redefine everything over time.

    • crabbone 23 minutes ago

      I wrote my own Protobuf implementation (well, with some changes). Ditching the default values was one of the changes I made. I don't see any reason to have that. But I don't think that Protobuf is a reasonable or even decent protocol in general. It has a lot of nonsense and bad planning. Having default values is probably not in the ten worst things about Protobuf.

  • beart 17 hours ago

    "Unspecified" is semantically different from "other". The former is more like a default value whereas the latter is actually "specified, but not one of these listed options".

    • hamandcheese 16 hours ago

      Standard practice in protobuf is to never assign semantic meaning to the default value. I think some linters enforce that enum 0 is named "unknown" which is actually more semantically correct than "other" or "unspecified".

NoboruWataya 17 hours ago

> Just document that the enumeration is open-ended, and programs should treat any unrecognized values as if they were “Other”.

Possibly just showing my lack of knowledge here but are open-ended enumerations a common thing? I always thought the whole point of an enum is that it is closed-ended?

  • sd9 17 hours ago

    I’ve worked on systems where the set of enum values was fixed at any particular point in time, but could change over time as business requirements changed.

    For instance, we had an enum that represented a sport that we supported. Initially we supported some sports (say FOOTBALL and ICE_HOCKEY), and over time we added support for other sports, so the enum had to be expanded.

    Unfortunately this always required the entire estate to be redeployed. Thankfully this didn’t happen often.

    At great expense, we eventually converted this and other enums to “open-ended” enums (essentially Strings with a bit more structure around them, so that you could operate on them as if they were “real” enums). This made upgrades significantly easier.

    Now, whether those things should have been enums in the first place is open for debate. But that decision had been made long before I joined the team.

    Another example is gender. Initially an enum might represent MALE, FEMALE, UNKNOWN. But over time you might decide you have need for other values: PREFER_NOT_TO_SAY, OTHER, etc.

  • hansvm 16 hours ago

    It's common when mixing many executables over time.

    I prefer to interpret those as an optional/nullable _closed_ enum (or, situationally, a parse error) if I have to switch on them and let ordinary language conventions guide my code rather than having to understand some sort of pseudo-null without language support.

    In something like A/B tests it's not uncommon to have something that's effectively runtime reflection on enum fields too. Your code has one or more enums of experiments you support. The UI for scaling up and down is aware of all of those. Those two executables have to be kept in sync somehow. A common solution is for the UI to treat everything as strings with weights attached and for the parsers/serializers in your application code to handle that via some scheme or another (usually handling it poorly when people scale up experiments that no longer exist in your code). The UI though is definitely open-ended as it interprets that enum data, and the only question is how it's represented internally.

  • furyofantares 12 hours ago

    This is not really the case mentioned (not API design), but I somewhat often have an enum that is likely to be added to, but rarely (lots of code will have been written in the meantime) and I would like to update all the sites using it, or at least review them. Typically it looks something like this:

        enum WidgetFlavor
        {
            Vanilla,
            Chocolate,
            Strawberry,
        
            NumWidgetFlavors
        };
    
    And then wherever I have switch(widgetFlavor), I include static_assert(NumWidgetFlavors == 3), so the assertion fires whenever a flavor is added. A bit jealous of Rust's exhaustive enums/matches.

  • int_19h 16 hours ago

    Both are valid depending on what you're modelling.

    As far as programming languages go, all enums are explicitly open-ended in C, C++, and C#, at least, because casting an integer (of the underlying type) to enum is a valid operation.

    • jay_kyburz 16 hours ago

      My pet hate is when folks start doing math on enums or assuming ranges of values within an enum have meaning.

      • DonHopkins 13 hours ago

        Like pesky Hex<=>Decimal conversion with the gap between the numbers and the letters, and upper/lower case letters too.

    • eru 13 hours ago

      Yeah, C, C++ (and C#) aren't very good at modelling data structures.

  • fweimer 15 hours ago

    Enumerations are open-ended in C and C++. They are just integer types with some extra support for defining constants (although later C++ versions give more control over the available operations).

  • gauge_field 14 hours ago

    One case where I made use of this is an enumeration of uarch types for different hardware, read from the host machine. The list is closed-ended until a new CPU with a new uarch appears, which takes a long time. So it is open-ended, but with very low velocity of change: ideal for an enum (for a very long time), but you still need to support changes to the list of variants without breaking semver.

  • XorNot 16 hours ago

    The first time you have to add a new schema value, you'll realise you needed "unknown" or similar - because during an upgrade your old systems need a way to deal with new values (or during a rollback you need to handle new entries in the database).

    • sitkack 14 hours ago

      Your comment is the only one in the entire discussion that mentions "schema". Having an "other" in a schema is a way to ensure you can run versions n and n+1 at the same time.

      It is data model design, of which API design is a subset.

      You can only ever avoid having an other if 1) your schema is fixed and 2) if it is total over the universe of values.

  • tbrownaw 16 hours ago

    Does a foreign key count as an enum type?

oytis 2 hours ago

Worth noting that in C and C++, an enum-typed variable holding a value not in the enum is UB. Had some funny bugs because of that.

jffhn 5 hours ago

>"programs should treat any unrecognized values as if they were “Other”"

Having such an "Other" value does not prevent you from considering the enum open-ended, and it greatly simplifies all the code that has to deal with potentially invalid or unknown values (no need for a validity flag or null).

That's probably why in the DIS (Distributed Interactive Simulation) standard, which defines many enums, they all start with OTHER, which has the value zero.
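
A minimal sketch of that convention (using the DIS-style Force Id values, where OTHER is zero): decoding collapses unrecognized wire values into OTHER, so downstream code needs no validity flag or null.

    #[derive(Clone, Copy, Debug, PartialEq)]
    enum ForceId { Other = 0, Friendly = 1, Opposing = 2, Neutral = 3 }

    fn decode(raw: u8) -> ForceId {
        match raw {
            1 => ForceId::Friendly,
            2 => ForceId::Opposing,
            3 => ForceId::Neutral,
            _ => ForceId::Other, // genuinely "other" and future values land here
        }
    }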

In STANAGs (NATO standards), the value zero is used for NO_STATEMENT, which can also be used when the actual value is in the enum but you can't or don't need to indicate it.

I remember an "architecture astronaut" who claimed that NO_STATEMENT was not a domain value, and removed it from all the enums in his application. That did not last long.

That also reminds me of Philippe Kahn (Borland) having, in some presentation, the ellipse extend the circle to add a radius. A scientist said he would do it the other way around, and Kahn replied: "This is exactly the difference between research and industry".

  • ivan_gammel 4 hours ago

    >That also reminds me of Philippe Kahn (Borland) having, in some presentation, the ellipse extend the circle to add a radius. A scientist said he would do it the other way around, and Kahn replied: "This is exactly the difference between research and industry".

    My favorite interview question on the OOP topic. It can be correct either way, or both can be wrong, so the good answer would be "it depends". When developers rush to give a specific answer, they do not demonstrate due attention to the domain, and it may mean they will assume a thousand other falsehoods from those articles on GitHub.

dataflow 13 hours ago

I think there are multiple concerns here, and they need to be analyzed separately -- they don't converge to the same solution:

- Naming: "Other" should probably be called "Unrecognized" in these situations. Then users understand that members may not be mutually exclusive.

- ABI: If you need ABI compatibility, the constraint you have is "don't change the meanings of values or members", which is somewhat stronger. The practical implication is that if you do need to have an Other value, its value should be something out of range of possible future values.

- Protocol updates: If you can atomically update all the places where the enum is used, then there's no inherent need to avoid Other values. Instead, you can use compile-time techniques (exhaustive switch statements, compiler warnings, temporarily removing the Other member, grep, clang-query, etc.) to find and update the usage sites at compile time. This requires being a little disciplined in how you use the enum during development, but it's doable.

- Distributed code: If you don't have control over all the code that might use your enum, then you must avoid an Other value, unless you can somehow ensure out-of-band that users have updated their code.

jasonkester 6 hours ago

This got me wondering what I actually do in practice. I think it's this:

  const KnownFlavors = {
    Vanilla: "Vanilla",
    Chocolate: "Chocolate",
    Strawberry: "Strawberry"
  }

Then, use a string to hold the actual value.

  doug.favoriteFlavor = KnownFlavors.Chocolate;
  cindy.favoriteFlavor = "Mint"

  case KnownFlavors.Chocolate:

Expand your list of known flavors whenever you like; your system will still always hold valid data. You get all the benefits of typo-proofing your code, switching on an enum, etc., without having to pile on any wackiness to fool your compiler or keep the data normalized.

It acknowledges the reality that a non-exhaustive enum isn’t really an enum. It’s just a list of things that people might type into that field.

  • Boldened15 5 hours ago

    Sorry I don't get the example, are both code blocks meant to be client-side code?

    > It acknowledges the reality that a non-exhaustive enum isn’t really an enum. It’s just a list of things that people might type into that field.

    I would say the opposite, the kinds of enums that map a case to a few hardcoded branches (SUCCESS, NETWORK_ERROR, API_ERROR) are often an approximation of algebraic data types which Rust implements as enums [0] but not most languages or data formats. Since often using those will require something like a `nullthrows($response->getNetworkError())` once you've matched the enum case.

    The kind of enum that's just a string whitelist, like flavors or colors, which you can freely pass around and store, likely converting it into a human-readable string or RGB values in one or two utils, is the classic kind of enum to me.

    [0] https://doc.rust-lang.org/std/keyword.enum.html

  • PartiallyTyped 4 hours ago

    The way we do this with the AWS SDK in Rust is by leveraging #[non_exhaustive] and matching an `other @ _` pattern. This is forward compatible, and allows us to do something like `other @ _ if other.name() == "foo"` for known cases without upgrading, either down the road or if the user uses an older version than our API.

layer8 15 hours ago

Slight counterpoint: Unless there is some guarantee that the respective enum type will never ever be extended with a new value, each and every case distinction on an enum value needs to consider the case of receiving an unexpected value (like Mint in the example). When case distinctions do adhere to that principle, then the problem described doesn’t arise.

On the other hand, if the above principle is adhered to as it should, then there is also little benefit in having an Other value. One minor conceivable benefit is that intermediate code can map unsupported values to Other in order to simplify logic in lower-level code. But I agree that it’s usually better to not have it.

A somewhat related topic that comes to mind is error codes. There is a common pattern, used for example by the HTTP status codes, where error codes are organized into categories by using different prefixes. For example in a five-digit error code scheme, the first three digits might indicate the category (e.g. 123 for “authentication errors”), and the remaining two digits represent a more specific error condition in that category. In that setup, the all-zeros code in each category represents a generic error for that category (i.e. 12300 would be “generic authentication error”).

When implementing code that detects a new error situation not covered by the existing specific error codes, the implementer has now the choice of either introducing a new error code (e.g. 12366 — this is analogous to adding a new enum value), which has to be documented and maybe its message text be localized, or else using the generic error code of the appropriate category.

In any case, when error-processing code receives an unknown — maybe newly assigned — error code, they can still map it according to the category. For example, if the above 12366 is unknown, it can be handled like 12300 (e.g. for the purpose of mapping it to a corresponding error message). This is quite similar to the case of having an Other enum value, but with a better justification.
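
A small sketch of that fallback (hypothetical codes): an unknown specific code degrades to the all-zeros generic code of its category before giving up entirely.

    fn known_message(code: u32) -> Option<&'static str> {
        match code {
            12300 => Some("generic authentication error"),
            12301 => Some("invalid credentials"),
            _ => None,
        }
    }

    fn message_for(code: u32) -> &'static str {
        known_message(code)
            // e.g. a newly assigned 12366 falls back to its category's 12300
            .or_else(|| known_message(code / 100 * 100))
            .unwrap_or("unknown error")
    }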

KPGv2 11 hours ago

> Rust has the "non_exhaustive" attribute that lets you declare that an enum might get more variants in the future.

Is there a reason, aside from documentation, that this is ever desirable? I rarely program in Rust, but why would this ever be useful in practice, outside of documentation? (Seems like code-as-documentation gone awry when your code is doing nothing but making a statement about future code possibilities)

  • LegionMammal978 11 hours ago

    Normally, when you match on the value of an enum, Rust forces you to either add a case for every possible variant, or add a default arm "_ => ..." that acts as a 'none of the above' case. This is called exhaustiveness checking [0].

    When you add #[non_exhaustive] to an enum, the compiler says to external users, "You're no longer allowed to just match every existing variant. You must always have a default 'none of the above' case when you're matching on this enum."

    This lets you add more variants in the future without breaking the API for existing users, since they all have a 'none of the above' case for the new variants to fall into.

    [0] https://doc.rust-lang.org/book/ch06-02-match.html#matches-ar...

  • jeroenhd 10 hours ago

    If your library processes data from another language, you'll probably need to deal with the possibility that the library returns open ended enums.

    I believe I've also seen this declaration for generated bindings for a JSON API that promises backwards compatibility for calls and basic functionality at least. Future versions may include more options, but the code will still compile fine against the older API.

    I don't think it's a great tool to use everywhere, but there are edge cases where Rust's demand for exhaustive matches conflicts with the non-Rust world, and that's where stuff like this becomes hard to avoid.

esafak 16 hours ago

Just add a free-form text field to hold the other value, and revise your enum as necessary, while migrating the data.

  • AceJohnny2 16 hours ago

    I can't even tell if you're trolling.

akamoonknight 15 hours ago

One of the tactics I end up using in Verilog, for better or worse, is to define enums with a '0 value (repeating 0s for the width of the variable) and a '1 value (repeating 1s for the width of the variable).

'0 stays "null"-like (e.g. INVALID), and '1 (which would be 0xFF in an 8-bit byte, for instance) becomes "something, but I'm not sure what" (e.g. UNKNOWN).

It definitely has the same issues as referenced when needing to grow the variable, and the times where it's useful aren't super common, but I do feel like the general concept of an unknown-but-not-invalid value can help with tracking down errors in processing chains. You definitely do need to "beware" with enums, though.

coin 17 hours ago

Just call it "unknown" or "unspecified" or better yet use an optional to hold the enum.

  • 101011 16 hours ago

    This ended up being the preferred pattern we moved into.

    If, like us, you were passing the object between two applications, the owning API would serialize the enum value as a String value, then we had a client helper method that would parse the string value into an Optional enum value.

    If the original service started transferring a new String value between services, it wouldn't break any downstream clients, because the clients would just end up with an empty Optional.
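
    In Rust terms, the client-side parse would look something like this sketch (hypothetical `Flavor` enum; the original was presumably Java's Optional):

      fn parse_flavor(s: &str) -> Option<Flavor> {
          match s {
              "VANILLA" => Some(Flavor::Vanilla),
              "CHOCOLATE" => Some(Flavor::Chocolate),
              // A brand-new value from an upgraded service shows up as
              // None instead of a deserialization error.
              _ => None,
          }
      }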

    • janci 16 hours ago

      How does that work when you need to distinguish between "no value provided" and "a value that is not in the list"? In some applications they have different semantics.

o11c 17 hours ago

The approach in the link is fine for consumers, but for producers you really do need some way of saying "create a value that's not one of the known values". Still, there's nothing that says this needs to be pretty.

moomin 4 hours ago

Also Microsoft: your enum should have an explicit Unknown entry with value 0.

sgondala_ycapp 14 hours ago

Random tidbit: we use an LLM to identify document types, using an enum to give it the list of options.

Initially, we didn’t include an "Other" category, which led the LLM to force-fit documents into existing types even when they didn’t belong. Obviously this wasn't the LLM's fault.

We realized the mistake and added "Other". This significantly improved output accuracy!

bob1029 15 hours ago

Making things into enums that shouldn't be enums is a fun trap to fall into. Much of the time what you really want is a complex type so that you can communicate these additional facts. In this case I'd do something like:

  class Widget 
  { 
    WidgetFlavor Flavor; //Undefined, Vanilla, Chocolate, Strawberry
    string? OtherFlavor;
  }
This is easy to work with from a consumer standpoint because if you have a deviant flavor to specify, you don't bother setting the Flavor member to anything at all. You just set OtherFlavor. Fewer moving pieces == less chance for bad times.

The first (default) member in an enum should generally be something approximating "Undefined". This also makes working with serializers and databases easier.

  • IshKebab 14 hours ago

    This is not a good design. You've introduced representable invalid states (Flavor=Vanilla, Other flavor="DarkChocolate").

    At the least you want this...

      enum Flavor {
        Chocolate,
        Banana,
        Strawberry,
        Other(String),
      }
    
    But that's not right either. What you really want is

      #[non_exhaustive]
      enum Flavor {
        Chocolate,
        Banana,
        Strawberry,
      }
    
      impl ToString for Flavor ...
  • msy 14 hours ago

    Until you get a

      Widget.OtherFlavor = 'Vanilla'
  • ryanschaefer 14 hours ago

    > Fewer moving pieces == less chance for bad times.

    Is this not a case for explicitly specifying all flavors? Other flavor has essentially introduced infinite moving pieces.

sylware 2 hours ago

As another example: Vulkan made the mistake of using enums in its API.

Now they must make sure each one is a signed 32-bit value on both 32-bit and 64-bit systems, i.e. check the compiler's behavior. If you check the code, they always add 0x7fffffff as the last enum value to "force" the compiler and to tell developers (who have enough experience) "hey, this is a signed 32-bit value"... whoopsie!

We should bite the bullet: remove the enums from Vulkan and use the appropriate primitive type for each platform ABI (not API...), so the fix should be transparent, as it would not break the ABI. But all the code generators using the Khronos XML specifications, and the static source code, would have to be modified in one shot to stay consistent. That is no small feat.

[NOTE: enum is one of those things which should be removed from the "legacy profile" of C (like tons of keywords, integer promotion, implicit casts, etc).]

hello12343214 11 hours ago

Good idea. I appreciate that he thought through future compatibility with old versions.

vadim_phystech 15 hours ago

...since the set of all possible unspecified behavior is much greater, and denser, than one would initially feel and assume, an "Other" type in an API can cause lots of possible bad outcomes and success-breaking points. "Other" is the first place to look for vulnerabilities, for attack vectors, because the spirit of UB the Terrible lurks there! The spirit of UB feeds upon the juices of the "Other" omnimorphic (fel) type! A foul, formless "ANY" type! Debauchery and disharmony! Decay and reducing heteromorphisms! Decomposition, descriptive semantic matrix rank reduction, richness degradation, devolution... impoverishment... scarcity pressure increase...

</shutting_the_fuck_up_my_wetware_machine_whispering_kek>

_3u10 15 hours ago

I usually use Unknown / Other as 0.

1oooqooq 16 hours ago

jr: add other option

sr: omit other option

illuminated: add other option in front end only and alert when the backend crashes.

DonHopkins 14 hours ago

https://en.wikipedia.org/wiki/Tony_Hoare#Research_and_career

>Speaking at a software conference in 2009, Tony Hoare apologized for inventing the null reference, his "Billion Dollar Mistake":

>"I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years." -Tony Hoare

Anders Hejlsberg brilliantly points out how JavaScript doubled the cost of that mistake:

>"My favorite is always the Billion-Dollar Mistake of having null in the language. And since JavaScript has both null and undefined, it's the Two-Billion-Dollar Mistake." -Anders Hejlsberg

>"It is by far the most problematic part of language design. And it's a single value that -- ha ha ha ha -- that if only that wasn't there, imagine all the problems we wouldn't have, right? If type systems were designed that way. And some type systems are, and some type systems are getting there, but boy, trying to retrofit that on top of a type system that has null in the first place is quite an undertaking." -Anders Hejlsberg

The JavaScript Equality Table shows how Brendan Eich simply doesn't understand equality for either data types or human beings and their right to freely choose who they love and marry:

https://dorey.github.io/JavaScript-Equality-Table/

Do any languages implement the full Rumsfeld Awareness–Understanding Matrix Agnoiology, quadrupling the cost?

Why stop at null, when you can have both null and undefined? Throw in unknown, and you've got a hat trick, a holy trinity of nihilistic ignorance, nothingness, and void! The Rumsfeld Awareness–Understanding Matrix Agnoiology breaks knowledge down into known knows, plus the three different types of unknowns:

https://en.wikipedia.org/wiki/There_are_unknown_unknowns

>"Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns—the ones we don't know we don't know. And if one looks throughout the history of our country and other free countries, it is the latter category that tends to be the difficult ones." -Donald Rumsfeld

1) Known knowns: These are the things we know that we know. They represent the clear, confirmed knowledge that can be easily communicated and utilized in decision-making.

2) Known unknowns: These are the things we know we do not know. This category acknowledges the presence of uncertainties or gaps in our knowledge that are recognized and can be specifically identified.

3) Unknown knowns: Things we are not aware of but do understand or know implicitly

4) Unknown unknowns: These are the things we do not know we do not know. This category represents unforeseen challenges and surprises, indicating a deeper level of ignorance where we are unaware of our lack of knowledge.

https://en.wikipedia.org/wiki/Agnoiology

>Agnoiology (from the Greek ἀγνοέω, meaning ignorance) is the theoretical study of the quality and conditions of ignorance, and in particular of what can truly be considered "unknowable" (as distinct from "unknown"). The term was coined by James Frederick Ferrier, in his Institutes of Metaphysic (1854), as a foil to the theory of knowledge, or epistemology.

I don't know if you know, but Microsoft COM hinges on the IUnknown interface. Microsoft COM's IUnknown interface takes the Rumsfeldian principle to heart: it doesn't assume what an object is but provides a structured way to query for knowledge (or interfaces). In a way, it models known unknowns, since a caller knows that an interface might exist but must explicitly ask if it does.

Then there's Schulz's Known Nothing Nesiology, representing the existential conclusion of all this: when knowledge itself is questioned, where does that leave us? Right back at JavaScript's Equality Table, which remains an unfathomable unknown unknown to Brendan Eich and his well known but knowingly ignorant War on Equality.

https://www.youtube.com/watch?v=HblPucwN-m0

Nescience vs. Ignorance (on semantics and moral accountability):

https://cognitive-liberty.online/nescience-vs-ignorance/

>From a psycholinguistic vantage point, the term “ignorance” and the term “nescience” have very different semantic connotations. The term ignorance is more generally more widely colloquially utilized than the term nescience and it is often wrongly used in contexts where the word nescience would be appropriate. “Ignorance” is associated with “the act of ignoring”. Per contrast, “nescience” means “to not know” (viz., Latin prefix ne = not, and the verb scire = “to know”; cf. the etymology of the word “science”/prescience).

>As Mark Passio points out, the important underlying question which can be derived from this semantic distinction pertains to whether our individual and global problems are caused by “ignorance” or “nescience”? That is, “ignoring” or “not knowing”? It seems clear that it is the later. We know about the truth but we actively ignore it for the most part. Currently people have all the necessary information available (literally at their fingertips). Ignoring the facts is a decision, an irrational decision, and people can be held accountable for this decision. Nescience, on the other hand, acquits from accountability (i.e., someone cannot be held accountable when he/she for not knowing something but for ignoring something). Quasi-Freudian suppression plays a pivotal role in this scenario. Suppression is very costly in energetic terms. The energy and effort which is used for suppression lacks elsewhere (cf. prefrontal executive control is based on limited cognitive resources). The suppression of truth through the act of active ignoring thus has negative implications on multiple levels – on the individual and the societal level, the cognitive and the political, the psychological and the physiological.

Brendan: While we can measure the economic consequences of your culpably ignorant mistakes of both bad programming language design and marriage inequality in billions of dollars, the emotional, social, and moral costs of the latter -- like diminished human dignity and the perpetuation of discrimination -- are, by their very nature, priceless.

Ultimately, these deeper impacts underscore that the fight for marriage equality, defending against the offensive uninvited invasion of your War on Equality into other people's marriages, is about much more than economics; it’s about ensuring fairness, respect, and equality for all members of society.

khana 4 hours ago

[dead]

hchja 16 hours ago

[dead]