I should have said "dedicated hardware"; I'd hoped that was obvious. I'm just not sure what I'm _buying_ if I buy an Alexa or dot. Obviously a speaker and microphone (and the associated amplifiers), but I've already got those in the laptop. Obviously some firmware and hardware to "connect to the internet", but again I've already got that.
With Alexa, you are buying a microphone array with phased-array capability.
This might allow "focusing" on your voice, improving the SNR.
There might be a Wiki with model information. Or maybe you
can find a picture of a PCB with labels on it.
A lot of single microphones like the one you refer to are "crap",
and if you've ever used speech to text, the software tells
you "no signal detected". That's what happens when you use
the wrong single microphone and then put computer fans behind
your head, making the SNR even worse.
If you don't like the Alexa packaging, it's possible there
are third party array microphones with a similar capability.
I've spotted 7-mic arrays and 16-mic arrays. Alexa might
have fewer microphones, arranged around the circumference of
the device. (Maybe five microphones, check the Wiki.)
You want an electret which is biased properly. Some microphones
don't work because the bias voltage is wrong.
This happens because, when you buy a microphone as a separate item,
nobody knows whether the device spec matches the bias on
the laptop.
Even the microphone on the built-in webcam in the laptop
is crap. And it is crap because it picks up electrical
noise (various tones, the sound of the mouse moving electrically),
and with AGC in use, the noise is "cranked" and a voice
assistant is going to hate your input.
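To illustrate the AGC point, here is a minimal sketch in Python. It is
not how any particular laptop codec does it (real AGCs smooth the gain
over time with attack and release settings). The sample rate, the target
level, the gain limit and the test tones are all assumptions picked for
the demonstration. It just shows that when nobody is speaking, a
level-normalizing AGC turns the gain way up and the electrical hum ends
up as loud as speech would have been.

```python
import math

RATE = 8000  # samples per second (assumed for this demo)

def rms(block):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(v * v for v in block) / len(block))

def agc(block, target_rms=0.25, max_gain=100.0):
    """Naive block AGC: scale the block toward target_rms, capped at max_gain.

    Hypothetical helper for illustration; returns (scaled_block, gain).
    """
    level = rms(block)
    gain = max_gain if level == 0 else min(max_gain, target_rms / level)
    return [gain * v for v in block], gain

def tone(freq, amplitude, n):
    """A plain sine tone standing in for hum or for a voice."""
    return [amplitude * math.sin(2 * math.pi * freq * i / RATE) for i in range(n)]

hum = tone(50, 0.005, 800)       # faint electrical noise only (assumed level)
voice = tone(200, 0.2, 800)      # stand-in for speech (assumed level)
speech = [v + h for v, h in zip(voice, hum)]

idle_out, gain_idle = agc(hum)          # nobody talking
speech_out, gain_speech = agc(speech)   # somebody talking

print(f"gain while speaking: {gain_speech:.1f}x")
print(f"gain while silent:   {gain_idle:.1f}x")
print(f"hum level after AGC when silent: {rms(idle_out):.3f} (target 0.250)")
```

The gain during silence comes out many times higher than the gain during
speech, which is the "cranked" noise a voice assistant then has to cope with.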
I have *one* good microphone here. It's an electret that
runs off a separate 5V supply, plus it has a four pin amplifier
after it. It gives line level (~1Vrms) output. It works at a
distance of 2-3 feet from my mouth. But since it is a single
microphone, it cannot remove all the fan noise around me. I could
kill all the fans in here, every last one... but where is the
fun in that.
7 microphones (MEMS type, not electret)
https://www.minidsp.com/products/usb-audio-interface/uma-8-microphone-array
"Step by step app notes for Google SDK, Alexa-Pi, Cortana, Siri, IBM Watson and Matlab. More coming up soon..."
16 microphones (MEMS; the description is not as featureful, as if it is a lab animal for someone).
https://www.minidsp.com/products/usb-audio-interface/uma-16-microphone-array
Alexa-Pi is an Alexa client that runs on RPi.
Page features a redirect to some other project :-)
Projects like this may require a client-key so
Alexa will actually listen to your input. It's not
a given that, just because you have a box that "formats"
the data properly, it will "just work". That's too easy.
https://github.com/alexa-pi
That's even assuming you need Alexa skills for what
you want in the first place. There might be some other,
local means of implementing your Smart Home.
The seven-channel device should produce one or two channels
of output, as if it were a "naive" microphone. The difference
is that it has done the noise reduction by using the phased array
to pick out speech coming from a point in space. This reduces
the influence of the fans, which are not in the beamform-selected area.
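As a rough sketch of the phased array idea, here is a delay-and-sum
beamformer in plain Python. I don't know what algorithm the UMA-8 or
Alexa firmware actually runs, so treat this as the textbook version only.
The sample rate, the per-microphone arrival delays, the 440 Hz "voice"
and the independent random noise on each microphone are all assumptions
for the demo (real fan noise is partly correlated between microphones, so
real gains differ).

```python
import math
import random

RATE = 16000   # samples per second (assumed)
N_MICS = 7     # same count as the 7-mic array above
STEP = 3       # assumed extra arrival delay per microphone, in samples
LENGTH = 4000  # samples per channel

def delay_and_sum(channels, delays):
    """Advance each channel by its delay (in samples), then average them."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
            for i in range(n)]

def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def db(ratio):
    """Amplitude ratio expressed in decibels."""
    return 20 * math.log10(ratio)

random.seed(1)
delays = [k * STEP for k in range(N_MICS)]

# The "voice" reaches microphone k later by delays[k] samples.
voice = [[math.sin(2 * math.pi * 440 * (i - d) / RATE) for i in range(LENGTH)]
         for d in delays]
# Each microphone also picks up its own noise (simplified as independent).
noise = [[random.gauss(0.0, 1.0) for _ in range(LENGTH)] for _ in range(N_MICS)]

# Beamforming is linear, so voice and noise can be processed separately
# to measure the SNR before and after.
snr_single = db(rms(voice[0]) / rms(noise[0]))
snr_array = db(rms(delay_and_sum(voice, delays)) / rms(delay_and_sum(noise, delays)))
gain = snr_array - snr_single

print(f"single mic SNR: {snr_single:.1f} dB")
print(f"7-mic beam SNR: {snr_array:.1f} dB")
print(f"improvement:    {gain:.1f} dB")
```

With the delays matched to the talker, the voice adds up in step while
the noise does not, so for independent noise the theoretical improvement
with 7 microphones is 10*log10(7), about 8.5 dB. Steering to a different
point in space just means choosing a different set of delays.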
There was even an array microphone (shaped like a salad bowl)
used for recording rock groups in a studio. You place it out in front
of the band, and after the recording session is finished
(recording the raw channels), the software can use the phased array
concept to pick out just the drums, just the vocalist,
and "synthesize" tracks as if each player had a discrete microphone
in front of them. You would likely use conventional microphones
anyway, in that studio, so you'd have a fallback if the fancy
method could not isolate one of the sources. Maybe the guitar
pickup would still be conventional for the electric guitar.
The whole idea is to clean up the audio to the point where
any further software is not complaining about the SNR.
Paul