machine learning – How is the code of voice assistants structured?

I am curious how voice assistants like Siri, Alexa, Google, Alisa and others process commands. Is it just lots of regexes?
I’ve experimented with a mobile app that analyzes the text of what the user said against a regex pattern. If the text matches the pattern, the app executes an action.
For example, commands like: “30 minutes to workout”, “50 minutes for homework session”.
I thought about how a user would ask for a timer, and these are the kinds of sentences I came up with.
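
To make the regex approach described above concrete, here is a minimal sketch in Python. The pattern and the function name `parse_timer_command` are made up for this illustration; they are not taken from my app or from any real assistant, and they only cover the two example phrasings.

```python
import re

# Hypothetical pattern for "<number> minutes to|for <activity>".
TIMER_PATTERN = re.compile(
    r"^\s*(?P<minutes>\d+)\s+minutes?\s+(?:to|for)\s+(?P<label>.+?)\s*$",
    re.IGNORECASE,
)


def parse_timer_command(text):
    """Return (minutes, label) if the text looks like a timer request, else None."""
    match = TIMER_PATTERN.match(text)
    if match is None:
        return None
    return int(match.group("minutes")), match.group("label")


print(parse_timer_command("30 minutes to workout"))            # (30, 'workout')
print(parse_timer_command("50 minutes for homework session"))  # (50, 'homework session')
print(parse_timer_command("what is the weather"))              # None
```

Any phrasing the pattern did not anticipate, such as "start a half-hour workout timer", falls through to `None`, which is the limitation that prompted this question.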

I’ve heard of Natural Language Processing, but I don’t really understand what it is.

Is it all just a big collection of regexes? If so, how can they be structured?
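
To show what I mean by "structured", one layout I can imagine is a table that maps each command type to the patterns that trigger it. This is only a sketch under my own assumptions: the names `INTENT_PATTERNS` and `detect_intent` and the two intents are hypothetical, and I do not know whether real assistants are organized this way.

```python
import re

# Hypothetical intent table: each intent name maps to a list of patterns.
INTENT_PATTERNS = {
    "set_timer": [
        re.compile(r"(?P<minutes>\d+)\s+minutes?\s+(?:to|for)\s+(?P<label>.+)", re.I),
        re.compile(r"set (?:a )?timer for (?P<minutes>\d+)\s+minutes?", re.I),
    ],
    "cancel_timer": [
        re.compile(r"(?:cancel|stop) (?:the )?timer", re.I),
    ],
}


def detect_intent(text):
    """Return (intent_name, slots) for the first matching pattern, else (None, {})."""
    for intent, patterns in INTENT_PATTERNS.items():
        for pattern in patterns:
            match = pattern.search(text)
            if match:
                # Named groups become the extracted parameters ("slots").
                return intent, match.groupdict()
    return None, {}


print(detect_intent("set a timer for 15 minutes"))  # ('set_timer', {'minutes': '15'})
print(detect_intent("stop the timer"))              # ('cancel_timer', {})
print(detect_intent("hello there"))                 # (None, {})
```

With a table like this, adding a feature means adding an entry rather than editing the matching logic, but every phrasing still has to be written by hand.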

If all the features are hand-picked by humans, how is the documentation laid out? Working it out by reading the code seems unrealistic, so I guess there must be some good documentation or a structure document kept alongside it?
How are these voice assistants organized (from the standpoint of text analysis)?

Or is it a pre-trained machine learning model file?

Please correct me if the question is not well formulated, and tell me how to improve it.