Oddbean new post about | logout
 When you find out that Bitcoin-Core devs just learned about memset being "insecure" by most compilers by default. 

https://github.com/bitcoin-core/secp256k1/compare/v0.5.1...v0.6.0

Noscrypt has used platform "secure" memset since it's inception:
https://git.vaughnnugent.com/cgit/vnuge/noscrypt.git/tree/src/nc-crypto.c#n197 
 wow

gonna be updating that for sure right away, i use that library 
 What is insecure about it? Also I have started looking at your Noscrypt. Sick work. Wish more people used it.

(Tempted to make python bindings) 
 Please do! I want to have support for many languages in the future 

The issues is that on my most *nix platforms the `memset(ptr, size);` call gets optimized away because there is no read after write. So it's considered insecure. Along with that both the cl.exe and cc compilers treat volatile argument pointers differently and they even disagree heavily about when to use volatile pointers either as arguments or even in the procedure body. Really the only right way is to look at the assembly after you compile your project to see if there are instructions to wipe the buffer correctly.  
 one of the many innovations of #golang over C

everything is wiped at allocation AND on free 
 ok tbh i'm not sure about at free... most golang signature and encryption implementations include a step at the end of stuff where they write zeroes to each byte/word of the memory of the objects

i think someone wrote a nice memfence thing also which triggers an interrupt when nearby memory is accessed 
 if they made the GC zero the memory at free that would be awesome... but i can see why it might not be... they should add a flag or something for sensitive values so they are definitely zeroed, that would put Go above all 
 Yeah I mean it's only useful in certain conditions like secrets. Clearing the stack with fencing is SLOW as fuck like 1000s of CPU cycles to write back to physical memory, often blocking other threads, and dealing with kernel preemption. That's a massive slowdown. Were willing to take the massive performance hit for secrets though.  
 Probably 100,000s of cpu cycles 
 That's wild.  
 oh yeah, my implementation using that library also

well, i haven't actually written much code that does signing

https://github.com/mleku/realy/blob/dev/p256k/secp256k1.go#L340

most of the ways in which i envision using it server side need the thing to stay live for the life of the app so i really need a memfence much more than this 
 I see. Are you sure that function doesn't get optimized away? Does GO have any compiler directives that you'd specify to guarantee that doesn't happen? Almost anytime you'd call that function in a C/C++/C#/PHP and likely many others, that call would get optimized for a no-read-after-write condition.  
 so if i made it check back that would dodge it... ok 
 like this:

```go
func Zero(sk *SecKey) {
	b := (*[96]byte)(unsafe.Pointer(sk))[:96]
	var tmp byte
	for i := range b {
		for tmp != 0 {
			b[i] = 0
			tmp = b[i]
		}
	}
}
```

this should make it check and compare at each step 
 ah oops i need to set tmp at nonzero to start 
 ```go
func Zero(sk *SecKey) {
	b := (*[96]byte)(unsafe.Pointer(sk))[:96]
	tmp := byte(1)
	for i := range b {
		for tmp != 0 {
			b[i] = 0
			tmp = b[i]
		}
	}
}
``` 
 i have an idea 
 forget zero, scramble 
 ```go
func Zero(sk *SecKey) {
	b := (*[96]byte)(unsafe.Pointer(sk))[:96]
	for i := range 8 {
		rand.Read(b)
	}
	// reverse the order and negate
	for i := range b {
		b[i] = ^b[len(b)-1-i]
	}
}
```

this should rewrite the bits 8 times with secure random bits and then reverse and negate them, i can't see how it could optimize that out 
 and yeah, i sorta have an idea about this... in Go, it caches tests with deterministic results, but tests that read from the crypto/rand library never do, neither do the ones even from pseudorandom generators like luke champine's frand, which uses chacha12/20 hash chains and seeds 
 Yeah many people write CSRNG data instead of zeroing. It's an option. Don't know the results of it's effectiveness. I've considered it in some instances.  
 it bypasses the problem of optimization 
 it also mitigates against flash memory shadows 
 I'm not sure it will do that. Depends on the controller I guess. Some flash controllers can randomize writes and mark an old block as stale for wear leveling 
 well, really the chances of a key falling into swap are nearly insignificant if the signing is being done frequently

but even stilll, i've put 8 rewrites on it, even if it optimizes out to only one actual write it's still effective 
 Yes and if it's that important you can always use mlock() or VirtualLock() 
 oh yeah i am always forgetting... with Go there is always cgo for this kind of thing or maybe it can be done with assembler, assembler is preferable because it doesn't complicate the build workflow

not sure about such kernel specific calls and build workflow but probably it's something you can just dump into the symbol tabel and voila 
 The compiler will unwind your statments, it MAY accept a read-after-write as a workaround, but unless that value is being used anywhere else the compiler might be smart enough to remove it away again. You need to be able to tell the compiler to leave your procedure alone not matter what it wants to optimize. 
 that's just not true, bitcoin core developers were well aware of the edge cases of memset-before-deallocation for a long time

here's the commit from 2012 that introduces use of OPENSSL_cleanse() instead of memset:
https://github.com/bitcoin/bitcoin/commit/0f8a6477825fbaad0d37233bdd3011d748f607ab

in addition to cleaning up, secure_allocator uses a pool of locked memory to keep wallet keys and such out of swap space

memset, where it is used for cleaning up keys, is used with an extra memory barrier to prevent it from being optimized away. this is the same method used many other libraries such as BoringSSL as well and even the linux kernel 
 Libsecp256k1 not the Bitcoin Core application. I included the compare link you can see for yourself.  
 that commit is an improvement, mostly with regard to stack memory-which tends to be overwritten quickly anyway, secp256k1 doesn't manage dynamic memory

your claim was that we weren't aware of the issue which is just bs 
 I'm very aware of it's edge cases, which is why "insecure" is in quotes.  
 More than anything I would consider it a guard. Initializing stack memory during procedure entry, and after is really a precaution imo. Didn't say it wasn't an improvment, nor makes the previous implementations insecure. Simply calling out the use case. The call did nothing on plenty if not most platforms, yet it was included. The entire point of calling memset at the end of a procedure on stack memory was the intention to clear stack memory, when it was likely NOT doing that at all. 

That's a mistake in my book.  
 yes-mind that the work that was just merged has been underway in three PRs, since at least 2017:

https://github.com/bitcoin-core/secp256k1/pull/448
https://github.com/bitcoin-core/secp256k1/pull/636
https://github.com/bitcoin-core/secp256k1/pull/1579

it took a long time because there's an unfortunate lack of reviewers for any change to secp256k1
i wish people would get involved more instead of criticizing and screaming "bitcoin core devs suck" from the sidelines 
 i would very much like to spend a whole month reviewing that library and porting it to pure Go and inserting some SIMD instructions where relevant 
 Reviewers are not contributors, that's a project issue. It doesn't look like there is a lack of contributions. I don't know what the processes now, but I don't remember it being worth many people's time to become part of the project because if their red tape.

I'm not suggesting Bitcoin-Core developers suck as a whole, most are probably far more intelligent than myself. I'm saying they made a pretty silly mistake that MANY if not MOST other developers make and that took way too long to make it to release (that should have been a patch release). 

I also think criticism shouldn't be stifled for the sake of "if you can do it better, just do it yourself" doesn't fix the problem that everyone else depends on X project and they have obvious problems they aren't fixing etc. 

Could I have worded my original note better, probably, do I think I shouldn't have said anything at all. No.  
 At this point I'm considering vendoring the entire project for noscrypt. I'll have to review open PRs to see what else is missing I could fast-track.  
 go always zeroes memory when allocated. always. always. always.

you should follow this with C allocations