Researchers in the UK claim to have translated the sound of laptop keystrokes into their corresponding letters with 95 percent accuracy in some cases.
That 95 percent figure was achieved with nothing but a nearby iPhone. Remote methods are just as dangerous: over Zoom, the accuracy of recorded keystrokes only dropped to 93 percent, while Skype calls were still 91.7 percent accurate.
In other words, this is a side channel attack with considerable accuracy, minimal technical requirements, and a ubiquitous data exfiltration point: microphones, which are everywhere, from our laptops to our wrists to the very rooms we work in.
This is why I always make sure there are no boffins around before I start typing.
If there are boffins around, I start typing out the GDPR guidelines in full.
What about Hornblower’s, Bolger’s, Took’s, Sackville’s or Grubb’s?
I wonder if you need to train it on a specific keyboard before it will work.
Most likely
That would limit the practicality quite a lot, as deskmats and typing style would change the sound of even a common keyboard.
I also notice that I slightly change my typing style between typing normally and entering my password.
That would limit the practicality quite a lot, as deskmats and typing style would change the sound of even a common keyboard.
Eh… I don’t know if it would be enough of a change. Also consider mass produced popular laptops (e.g. targeting the MacBook keyboard).
I also notice that I slightly change my typing style between typing normally and entering my password.
I don’t really think that’s normal… But hey, maybe it gives you some protection 🙂
I doubt so. Wouldn’t Zipf’s law be used for this?
I think I might have achieved security through obscurity. My custom keyboard is a unique shape and almost all the keys are one unit. Not only is it different enough from a traditional keyboard that the neural network probably won’t understand it, the function layers I use obscure whether I’m typing a letter at all.
What keyboard is it, corne? I have to admit that your keycaps are incredibly cursed, how you have mixed caps from different layers
It’s a chocofi.
CTGAP on the base layer, and 6 layers on top of it, using a heavily modified version of Miryoku.
Most of the keycaps are correct, just for different layers. It helps prevent key peeking, plus I like the cursed aesthetic.
that’s a surprisingly cheap keyboard. I ended up ordering a zsa voyager a couple days ago because I wanted keys, but I couldn’t find any prebuilt split keyboards that had a base configuration below like $350. I might end up going with cursed keys on mine, it looks pretty cool
Does that come with free fingerless gloves?
No, but it comes with your choice of flavoured frozen yoghurt.
That’s good!
The yogurt contains potassium benzoate.
That’s bad
I have a headache just looking at that.
Quite scary considering the accuracy and how many open mics everyone is surrounded by without even realizing it. Not to mention, if any content creators type their password while live streaming or recording, they could get their accounts stolen.
One more reason to switch to a password manager, even though they could still find out the master password…
Probably still have some safety if you’re using two-factor, or have a master key in addition to a password (e.g. 1Password).
Or use a local password safe like keepass.
Or host it yourself like the smart one you are
Password manager and the LOUDEST MECHANICAL KEYBOARD POSSIBLE you have NO idea what keys I’m pressing with my blues, bitches
That’s the whole point though. The louder your keypresses the better.
laptop
laptop
laptop
I don’t think you read the article
A loud ass mech keyboard would fuck this study up
Only if you have to type it in to unlock your vault. Now, bear with me.
Bitwarden (maybe others) lets you set a PIN to unlock your vault. Normally, you would think this is a less secure setup, easier to crack with the method outlined in this article. Except with Bitwarden you have to set up the PIN in every browser extension and every app install.
Meaning, unless they have access to your device, the PIN to unlock one instance of Bitwarden could be different from the PIN for another. They also don’t have to be strictly 4-digit PINs, either. I highly recommend password managers, but for my money, Bitwarden has all my love.
Disclaimer: I am in no way affiliated with Bitwarden. But I could be if they paid me!
but then I have to remember the PIN for each one of my devices. there should be some kind of app for storing those.
Do what users at most businesses do, write it on a sticky note and put it on the underside of your keyboard!
Stick around for more tech tips with a real life sysadmin!
This has been a known attack vector for years, and I wonder how no livestreamer has been (publicly) attacked in this way.
I guess in large part this can be attributed to 2FA, passwords just aren’t worth much by themselves anymore (well I guess if someone is quick enough they can snipe the OTP as well, but streamers are rarely entering their 2FA while streaming since they’re on a trusted device).
In fact the biggest attack vector I’d worry about is the infamous SMS 2FA, which is actually 1FA for password resets, which is actually 0FA “yes dear phone operator I am indeed Mister Beast please move my phone number to this new SIM”.
How well does this work if there’s other noise pollution? Like music playing, etc.?
New policy from the corporate office: If you are working in a public place, like a coffee shop, please scream while typing your login password.
use the onscreen keyboard
much more secure
why won’t my bank stop calling me
I screamed my password and now I got hacked. Thanks for nothing!
Not to be a jerk, but is this actually new? I’ve heard of this being done at least ten years ago…
On another note, one way to beat this (to a degree) would be to use an alternate keyboard like Dvorak (though you could probably code it to be able to detect that based on what’s being typed)
Coding for alternate key mappings is almost as trivial as detecting other languages.
It’s more trivial because it’s a 1:1 relationship. A is a, s is o, d is e, and so on. Detecting other languages is harder because there are more of them and there isn’t a 1:1 conversion to English.
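For what it’s worth, the 1:1 remap really is a one-liner. A toy sketch (standard US QWERTY and Dvorak rows, position for position):

```python
# Physical key positions in QWERTY order vs. what Dvorak prints there
# (top, home, and bottom rows of the main letter block).
QWERTY = "qwertyuiopasdfghjkl;zxcvbnm,./"
DVORAK = "',.pyfgcrlaoeuidhtns;qjkxbmwvz"

remap = str.maketrans(QWERTY, DVORAK)

def qwerty_to_dvorak(text: str) -> str:
    """Reinterpret recovered QWERTY key positions as Dvorak output."""
    return text.translate(remap)

# Home-row check: QWERTY "asdf" positions print "aoeu" on Dvorak.
print(qwerty_to_dvorak("asdf"))  # → aoeu
```

So an attacker who recovers key positions on a Dvorak typist just runs the candidate text through a table like this, once per plausible layout.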
Yeah, that’s what I figured
I think it’s largely been a state actor thing. Directional microphone to record your window from across the street, spend significant tax money on crunching numbers on a supercomputer to get at your password kind of thing, I think they already could do it in the 90s. Real-time 95% accuracy on a non-specialised device is a quite different ballpark: Now every skiddie can do it.
Now every skiddie can do it.
And this is the real, serious problem. Most people are pretty unlikely to stop a state sponsored spy operation no matter how careful they are. It’s barely worth worrying about unless you know for a fact you’re being tapped and that you will be killed about it, and even if you do know this the state can pull some space age bullshit out of their asses that doesn’t yet have a counter. Top secret military industrial research goes into maintaining that exact advantage every year, if they really want to get you, you will get got. But if Joey Dickbeater and his school friends can just point a mic at your window and then upload it to the Pass-o-Gram to decode it, you have a real problem. It’s like when TikTok kids figured out they can steal Kias with usb keys - if every teenager in America knows how to steal your car, its lifetime is going to be measured in minutes. Same with passwords.
Sounds like it’s time to buy a bunch of random cherry switches and randomize them across my keyboard…
What it means is that NIST probably needs to update its security recommendations to require hardware keys for even low level systems. It’s going to be a huge pain in the ass though.
Sounds like it’s time to buy a bunch of random cherry switches and randomize them across my keyboard…
And rotate them. While I don’t plan to waste my energy, having hot swap sockets and swapping a few around should thwart the attack. You would have to do it frequently enough that relevant training data gets wasted before it’s useful. I’m pretty paranoid, but not that much.
I’ll just consider it good security hygiene to get a new keyboard often :)
Have you considered only re-doing the tinfoil wrapper every day? It should crackle differently every time.
Gotcha, that makes more sense
There has been previous work on this, yes. It required a dictionary of suggested words. That would make it useful for snooping most typing, but not for randomly generated passwords. This new technique doesn’t seem to have that limitation.
So about those people that run around saying passphrases are better… 😅
Okay, gotcha. I didn’t look that deeply into it previously so I never realized how limited that was
Neat, so when my friends are talking about satisfyingly clackety keyboards I can inform them it’s a security hazard.
Good luck, I have a non standard key layout
It’s still vulnerable to dictionary attacks
Except it’s not
??? If you can map sound to QWERTY keystroke placement, then it’s a simple matter of monoalphabetic substitution for other layouts to generate candidate texts. Using a dictionary attack to find more candidate layouts would absolutely work.
No, all the timings change. You can’t just swap out the letters and hope it matches. Additionally I was responding to the poster claiming a dictionary attack on a password would work - only if it’s in the dictionary.
The method is not based on timings. It is based on identifying the unique sound profile of each keystroke
How can you make that claim? They used deep learning, does anyone know what characteristics the AI is using?
Dvorak?
Middle management will finally get rid of clacky keyboards with this weird trick
I’ll accept the risk. I need the clicky
Good luck making an acoustic map of the tens of thousands of possible case, switch, and keycap combinations.
Might have to spend some time getting Easy Effects/NoiseTorch set up on my systems again just to reduce the attack vectors.
There is a good comment on this post on physical mitigation that seems helpful as well: https://www.reddit.com/r/Fedora/comments/uerp9z/comment/i6p0jqa/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button
Someone explain how this works? Doesn’t make much sense to me how that’s even possible.
This is just me kind of guessing off the top of my head, but:
- Depending where the mic is in relation to the keyboard, it can tell to some extent the relative distance from the key to the mic by volume of the keypress.
- The casing of the keyboard has a particular shape with particular acoustic properties which would make certain keys sound different than others. (Maybe the ones toward the middle have a more bass sound to them as opposed to more treble in the keys closer to the edges of the keyboard.)
- The surface on which the keyboard sits may also resonate differently with different keys.
- There may be some extent to which the objects in the room (including the typist and monitor, etc) could have reflected or absorbed soundwaves in ways that would differ depending on the angle at which the soundwaves hit them, which would be affected by the location of the key.
- Some keys like the spacebar and left shift almost always have a stabilizer bar which significantly affects the sound of the key for most keyboards.
- For human typists, there are patterns in the timing of key presses. It’s quicker to type two keys in succession if those two keys are pressed by different fingers or different hands, for instance. Imagine typing the word “jungle”: “j”, “u”, and “n” are all pressed with the right index finger (for touch typists), so the first three letters would be slower to type than the rest.
- I’d imagine this method also allowed the program to take into account various aspects of human language. (Probably English in this case, but it could just as well have been another language.) Certain strings of consonants just never appear consecutively. Certain letters are less frequently used. Things like that. Probably the accuracy would have been lower if the subjects were asked to type specific strings of random letters.
- It may also be that this particular experiment involved fairly controlled circumstances. They always placed the mic 17cm from the keyboard, for instance. Maybe they also used the exact same keyboard on the exact same desk with the exact same typist for all tests and training. And it sounds like they trained it on known text for a good while before testing the AI by asking it to actually discern what was typed. Those are pretty perfect conditions that probably wouldn’t be realistic for an actual attack. Not to minimize the potential privacy impacts of this, though. I’d fully expect methods like this to become more accurate for a more generalized set of cases.
Now, the researchers didn’t sit down and list out all of these (or any other) ways in which software could determine what was typed from audio and compose an algorithm that accounted for all/most/some of them. They just kind of threw a bunch of audio with accompanying “right answers” at a machine learning algorithm and let it figure out whatever clues it could discern and combine them in whatever way it found most beneficial to come up with an (increasingly-more-accurate-with-every-training-set) answer. It’s likely the algorithm came up with different things than I did that helped it determine which key(s) were being pressed.
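To make the “learn each key’s sound signature” idea concrete, here’s a deliberately tiny sketch. This is not the researchers’ model (they used deep learning on spectrograms); the four-key “keyboard”, the synthetic keystroke sounds, and the nearest-centroid classifier are all my own stand-ins, just to show the train-on-labelled-audio, classify-new-audio loop:

```python
import numpy as np

rng = np.random.default_rng(0)
SR = 8000          # sample rate (Hz)
DUR = 0.02         # 20 ms per keystroke
KEYS = "asdf"      # toy 4-key "keyboard"

def keystroke(key, noise=0.1):
    """Synthesize a toy keystroke: each key rings at its own pitch."""
    t = np.arange(int(SR * DUR)) / SR
    f = 500 + 200 * KEYS.index(key)   # a distinct frequency per key
    return np.sin(2 * np.pi * f * t) + noise * rng.standard_normal(t.size)

def features(signal):
    """Magnitude spectrum as a crude stand-in for a mel spectrogram."""
    return np.abs(np.fft.rfft(signal))

# "Train": average the spectra of 20 labelled keystrokes per key.
centroids = {k: np.mean([features(keystroke(k)) for _ in range(20)], axis=0)
             for k in KEYS}

def classify(signal):
    """Nearest-centroid match against the learned per-key profiles."""
    spec = features(signal)
    return min(KEYS, key=lambda k: np.linalg.norm(spec - centroids[k]))

# "Attack": transcribe 50 fresh keystrokes we weren't trained on.
typed = [str(rng.choice(list(KEYS))) for _ in range(50)]
guessed = [classify(keystroke(k)) for k in typed]
accuracy = float(np.mean([g == t for g, t in zip(guessed, typed)]))
```

On this clean synthetic audio the accuracy is near-perfect; the hard part in the real attack is exactly everything this sketch skips (isolating keystrokes from a recording, overlapping presses, room acoustics, and keys that sound almost identical).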
Because of different placement on the keyboard and different finger pressure, each key press has a slightly different sound.
The telling thing in this story is this
with 95 percent accuracy in some cases.
For some people (those with a very consistent typing style on a known keyboard) they were right 95% of the time.
In the real world this type of thing is basically useless as you would need a decent sample of the person typing on a known keyboard for it to work.
To go from keystroke sounds to actual letters, the eggheads recorded a person typing on a 16-inch 2021 MacBook Pro using a phone placed 17cm away and processed the sounds to get signatures of the keystrokes.
So to do this you need to have physical access to the person (to place a microphone nearby), know what type of device they are typing on, and for it to be a device whose sound profile you have already analysed.
So basically if they know what type of hardware you’re using, and have training on that type of hardware, then it works. It can’t just be literally any keyboard, right?
That makes more sense.
You don’t need physical access, just some malware that has access to the microphone
We would hope researchers “discovering” this wouldn’t have a production-ready product as their proof of concept. So there is room for improvement, but military contractors would love to invest in this.
You don’t need physical access, just some malware
Which you still need to have previously installed…
If the person has allowed malware to be installed just install a keylogger (which gives you 100% accuracy every time) rather than jump through more hoops with this.
Different devices
I would have an easier time infecting someone’s personal phone than a company machine
You would, would you?
Well, I must be talking to a leet hacker then.
Ok, install malware on my phone.
How did you get that from what I said?
I would have an easier time infecting someone’s personal phone than a company machine
What did you mean by this then other than you, personally, are skilled at such things and have system penetration experience?
The article says
The researchers note that skilled users able to rely on touch typing are harder to detect accurately, with single-key recognition dropping from 64 to 40 percent at the higher speeds enabled by the technique.
Hm. Sounds like “some cases” are hunt and peck typists or very slow touch typists.
I don’t know if training for each victim’s typing is really needed. I get the impression they were identifying unique sounds and converting that to the correct letters. I only skimmed and I didn’t quite understand the description of the mechanisms. Something about deep learning and convolution or…? I think they also said they didn’t use a language model so I could be wrong.
The problem is that even with up to 95% per-character accuracy, on a 10-character password there’s only about a 60% chance that every character comes out right.
A password with one character wrong is just as useless as randomly typing.
Which character is wrong, and what should it be? You only have 2 or 3 more guesses till most systems will lock the account.
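Quick sanity check on the numbers, assuming each character is independently recovered with 95% accuracy:

```python
from math import comb

p = 0.95   # assumed per-character accuracy
n = 10     # password length

# Probability the whole transcription is correct:
all_right = p ** n                                # ≈ 0.599

# Probability of exactly one wrong character:
one_wrong = comb(n, 1) * p ** (n - 1) * (1 - p)   # ≈ 0.315
```

So the attacker gets the full password on the first try about 60% of the time, and is a single-character fix away roughly another third of the time, which is well within a few lockout-limited guesses if the errors can be ranked.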
This is an interesting academic exercise but there are much better and easier ways to gain access to passwords and systems.
The world is not a bond movie.
Deploying social engineering is much easier than this sort of attack.
The world is not a bond movie.
Deploying social engineering is much easier than this sort of attack.
Have you never seen a Bond movie? Yeah they always have a gadget or two, but the rest is basically him social engineering his way through the film. And shooting. Usually lots of shooting too.
“Hearing” the same password twice drastically increases the accuracy, however, social engineering is indeed the most effective and efficient attack method.
I was thinking of this attack in terms of grabbing emails, documents, stuff like that. Or snippets thereof.
If the password is not random, as they seldom are, you can just guess the last character, or even the last few characters, if they are not correct.
I imagine it probably also uses an algorithm to attempt to “guess” the next letter (or the full word itself, like your phone keyboard does) based on existing words. Then maybe an LLM can determine which of the potential words are the most likely being typed based on the context.
I dunno if that makes any sense, but that’s how I pictured it working in my brain movies.
They’ll have modelled the acoustic signals to differentiate between different keys. Individual acoustic waves emanating from pressing a key will have features extracted from them to identify them. Optimal features are then chosen to maximise accuracy, such as features that still work when the signal is captured at different distances or angles. With all these types of signal processing inference models, you never get 100 percent. The claim of 95 percent is actually very high.
Every key is unique and at a different distance to the microphone and therefore makes tiny differences in noise.
Knowing this, and knowing the frequency distribution of letters in language (e.g. we know “e” is the most common letter) and some clever analysis over a large enough sample of typing, we can figure out what each key sounds like with a statistically high level of probability. Once that’s happened, it’s just like any other speech recognition software, except it’s the language of your keyboard.
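That frequency step is basically classical cipher breaking. A toy sketch (the letter ordering is the standard English frequency ranking; a real attack would use far more context than raw rank-matching, but the principle is the same):

```python
from collections import Counter

# English letters, most to least common (a standard textbook ordering).
ENGLISH_FREQ = "etaoinshrdlcumwfgypbvkjxqz"

def frequency_guess(symbols: str) -> str:
    """Map each unknown key-sound symbol to a letter by frequency rank.

    `symbols` is a string where each character stands for one distinct
    key-sound cluster; we guess the most common cluster is 'e', the
    next is 't', and so on.
    """
    ranked = [sym for sym, _ in Counter(symbols).most_common()]
    mapping = dict(zip(ranked, ENGLISH_FREQ))
    return "".join(mapping.get(c, c) for c in symbols)
```

With enough typed text, the most frequent sound cluster lands on “e” almost surely, the next few follow, and the remaining ambiguity collapses once you start filling in recognizable words.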
Article doesn’t say but I would guess they are testing with words and using that to build context for better accuracy. I imagine if you are typing some random password it would not be as accurate. Also the only password I type nowadays is the one to unlock the computer, everything else is in a password vault.
Is it ignorance, indemnity, or conspiracy that this News Media Corporation didn’t give the primary mitigation?
A white noise generator.