Sometimes as you're developing a project it's easy to lose sight of the end goal. I've spent so long on Listener's context setup, code-to-speech converters, etc. that the actual point of the whole thing (i.e. letting you dictate text into an editor) hasn't actually been built. Today I started on that a bit more. I got the spike-test for a DBus service integrated into the main GUI application, so that you can now get events from the same Listener that's showing you the app-tray icon and the results. I disabled the sending of "level" events across DBus, as there's a ridiculous number of them and client apps likely don't need them (they're just used to show a level indicator during speech).
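The "level" suppression amounts to a filter in front of the DBus emitter. This is just a sketch of the idea, not Listener's actual code; the event shape and the `forward_events` name are illustrative assumptions:

```python
def forward_events(events, send):
    """Forward Listener events over the bus, dropping audio-level updates.

    `send` stands in for whatever emits a DBus signal; `events` are
    assumed to be dicts with a 'type' key (all hypothetical names).
    """
    for event in events:
        # "level" events fire constantly during speech and only drive
        # the local level-indicator widget, so don't put them on the bus
        if event.get('type') == 'level':
            continue
        send(event)
```

Client apps then only ever see the events they can act on (partial and final utterances, state changes, and so on).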
Then I started working on the actual "interpret the utterances" code, which will eventually be user-driven, but for now I'm just hard-coding the rules. The easiest rules are the punctuation bits, where the dictation is generally ',comma' and maps directly to ',', with a few bits and bobs to get the spacing fixed up. The next easiest are rules such as "delete spaces between successive open-parens"... and that's where I stopped today. Currently this is all regex-based; eventually it's going to need a bit more intelligence (meta-dictation such as "cap" or "all-caps"), but I don't want to build *too* much intelligence into it. I voice-coded successfully for years with a system that was not much more involved than what I've got now, and I really should get to the "integrate the voice dictation into my editor and start really working with it to work out the kinks" phase soon.
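To make the regex approach concrete, here's a minimal sketch of what such hard-coded rules look like. The spoken forms, the `PUNCTUATION` table, and the `interpret` function are illustrative assumptions, not Listener's real rule set:

```python
import re

# Hypothetical spoken-form -> symbol table; the real rules will
# eventually be user-driven rather than hard-coded like this.
PUNCTUATION = {
    ',comma': ',',
    '.period': '.',
    '(open-paren': '(',
    ')close-paren': ')',
}

def interpret(words):
    """Map spoken punctuation words to symbols, then fix up spacing."""
    text = ' '.join(PUNCTUATION.get(word, word) for word in words)
    # no space *before* commas, periods, or close-parens...
    text = re.sub(r'\s+([,.)])', r'\1', text)
    # ...and none *after* an open-paren, which also deletes the
    # spaces between successive open-parens
    text = re.sub(r'\(\s+', '(', text)
    return text
```

So `interpret(['hello', ',comma', 'world'])` yields `'hello, world'`, and two dictated open-parens in a row come out as `'(('` rather than `'( ('`.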