Hello World! This is the very first post of MakeAI, and I have something really cool for geeks. Most of you are probably familiar with Apple’s Siri and with JARVIS, Iron Man’s virtual personal assistant. Ever wanted to make such an AI yourself? You have come to the right place.
Our JARVIS:
Okay, let’s get to business. First of all, here is the list of things you need to make your own version of the AI:
- WinRAR
- Visual Studio 2010 or Visual Studio 2013, with Visual Basic and C#.
- System requirements: if you can run Visual Studio on your machine, you are all set!
- An internet connection.
- Some Google accounts (I will explain later in this post why you need more than one).
Once you meet the requirements listed above, follow the instructions below. (Note that simply downloading the source code and compiling it will not work. Please be patient and go through all the steps so that you do not run into problems later.)
Click here to download the source code (it is in VB.NET and C#).
Extract the downloaded RAR file to a new folder.
A short note on speech recognition (please read it; it is really important to understand what is going on under the hood):
We want the program to recognize the user’s speech efficiently. For this purpose, the program uses both the built-in Windows Speech Recognition engine (System.Speech.Recognition), which works offline, and the Google Speech API V2, which works online.
Why use two speech recognition engines?
The Windows Speech Recognition engine is good at recognizing predefined phrases: the program gives it a list of words and phrases (called a Grammar) that the user is expected to speak. It is also customizable, since we can add non-dictionary words (such as the name of an Indian friend) to the Grammar. But I don’t recommend Windows Speech Recognition for ‘free dictation’ (where the user might say anything), because most of the time the output text bears no relation to what you said.
Other notable features of Windows Speech Recognition:
- You need not press a button before and after you speak. The program automatically detects when you start (and stop) speaking.
- Windows Speech Recognition works offline.
Google speech recognition, on the other hand, is really good at recognizing anything you say, but you cannot add non-dictionary words to its grammar. It works entirely online, and you need a fast internet connection for good results. The recognition flow also differs from that of Windows Speech Recognition: the program uploads the audio file to the server -> the speech is converted to text -> the server returns the recognized text. This means that you have to press a button on the screen before and after you speak. Also, the API is available only to developers, and using it is tricky.
The Solution:
Use both. We define a grammar containing the most commonly used words (like ‘HAI’, ‘Hello’, etc.) and non-dictionary words like ‘JARVIS’. We use Windows Speech Recognition to recognize these words.
If the user says something that is not listed in the grammar, we send the recorded audio to Google for recognition. The recognized text is then saved to the grammar, so that the next time the user says the same thing, Windows Speech Recognition can handle it by itself, without uploading the file to Google, which costs more time and resources.
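The hybrid flow described above can be sketched in a few lines. This is only an illustration in Python, not the project's actual VB.NET/C# code: `HybridRecognizer`, `match_offline` and `fake_online_engine` are hypothetical names, spoken input is modeled as a plain string, and the online engine is replaced by a stand-in function so the example runs without a network connection.

```python
from typing import Callable, Optional, Set


class HybridRecognizer:
    """Toy model of the hybrid scheme: grammar first, online fallback second."""

    def __init__(self, grammar: Set[str], recognize_online: Callable[[str], str]):
        # Grammar phrases are kept in upper case, like Dictionary.txt.
        self.grammar = {phrase.upper() for phrase in grammar}
        self.recognize_online = recognize_online
        self.online_calls = 0

    def match_offline(self, utterance: str) -> Optional[str]:
        # Stand-in for the offline engine: it only "hears" grammar phrases.
        phrase = utterance.strip().upper()
        return phrase if phrase in self.grammar else None

    def recognize(self, utterance: str) -> str:
        text = self.match_offline(utterance)
        if text is not None:
            return text
        # Not in the grammar: fall back to the (slower) online engine.
        self.online_calls += 1
        text = self.recognize_online(utterance).strip().upper()
        # Remember the phrase so the next request stays offline.
        self.grammar.add(text)
        return text


def fake_online_engine(utterance: str) -> str:
    # Placeholder for the upload-and-transcribe round trip (an assumption).
    return utterance


if __name__ == "__main__":
    jarvis = HybridRecognizer({"HAI", "HELLO", "JARVIS"}, fake_online_engine)
    print(jarvis.recognize("hello"))         # handled by the grammar
    print(jarvis.recognize("open wordpad"))  # falls back to the online engine
    print(jarvis.recognize("open wordpad"))  # now handled by the grammar
    print("online calls:", jarvis.online_calls)
```

The point of the design is that each unknown phrase costs one online query at most; every repeat after that is served from the grammar.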
To get speech recognition working, follow these instructions:
You need something called an API key to get Google speech recognition working. Each API key allows only 50 queries per day, so it is recommended that you sign up for multiple Google accounts and create an API key for each of them. This is how you get the Speech API key for a given Google account:
- Join this Group with your Google Account : https://groups.google.com/a/chromium.org/forum/#!forum/chromium-dev
- Go to https://console.developers.google.com/
- Sign in and start a new project. Click on your project link and select ‘Enable an API’.
- A list of available APIs appears. Enable the Speech API. It will appear only if you have joined the group given above.
- Under the APIs & Auths section, you will see a link ‘Credentials’. Click it, and then click the ‘Create new Key’ button that appears. A dialog will pop up.
- Click ‘Browser key’. A window will pop up asking for a referrer. Never mind that; simply click ‘Create’.
- You will get an API key for the Google account you signed in with. Repeat these steps for each of your Google accounts.
- In the extracted folder, locate the file Keys.txt (here is the exact path: ‘Project JARVIS\JARVIS UI\bin\Debug\Keys.txt’)
- Open the file in Notepad. Write the API keys to the file, separated by commas.
abcd,efgh,hijk,lmno
Note: no comma at the end!
- Save the file.
- There is a file called Dictionary.txt in the same location as Keys.txt (here is the exact path: ‘Project JARVIS\JARVIS UI\bin\Debug\Dictionary.txt’)
- This file contains commonly spoken words and non-dictionary words that should be recognized by the Windows speech recognizer. You can edit this file, but be careful: don’t leave it empty, don’t leave any unnecessary blank spaces, don’t put a comma at the end of the file, and don’t put two consecutive commas anywhere in it. Everything should be in upper case.
HAI,HELLO,HOW ARE YOU,JARVIS,WAKE UP JARVIS
- Set up Windows Speech Recognition from the Control Panel: Control Panel\All Control Panel Items\Speech Recognition
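Since a stray comma or space in Keys.txt or Dictionary.txt is easy to miss, a small checker can help before you run the program. The following is a minimal Python sketch, not part of the JARVIS source: `parse_comma_list` and `KeyRotator` are hypothetical names, the strict white-space check is my reading of the rules above, and the rotation strategy is an assumption, since the post does not say how the program switches between keys. Only the 50-queries-per-day figure comes from the post.

```python
from typing import List

DAILY_LIMIT = 50  # queries per key per day, as stated in the post


def parse_comma_list(text: str, require_upper: bool = False) -> List[str]:
    """Split a Keys.txt / Dictionary.txt style string and enforce the rules."""
    if text != text.strip():
        raise ValueError("leading or trailing white space")
    if not text:
        raise ValueError("file is empty")
    items = text.split(",")
    for item in items:
        if item == "":
            raise ValueError("trailing comma or two consecutive commas")
        if item != item.strip():
            raise ValueError(f"extra space around {item!r}")
        if require_upper and item != item.upper():
            raise ValueError(f"{item!r} is not upper case")
    return items


class KeyRotator:
    """Hands out keys in order, moving on when one reaches the daily limit."""

    def __init__(self, keys: List[str], limit: int = DAILY_LIMIT):
        self.keys = keys
        self.limit = limit
        self.used = {key: 0 for key in keys}

    def next_key(self) -> str:
        for key in self.keys:
            if self.used[key] < self.limit:
                self.used[key] += 1
                return key
        raise RuntimeError("all keys have reached today's limit")


if __name__ == "__main__":
    keys = parse_comma_list("abcd,efgh,hijk,lmno")
    rotator = KeyRotator(keys)
    print("first key to use:", rotator.next_key())
    words = parse_comma_list(
        "HAI,HELLO,HOW ARE YOU,JARVIS,WAKE UP JARVIS", require_upper=True
    )
    print("dictionary entries:", len(words))
```

With four keys and a limit of 50 each, a rotator like this would allow up to 200 online queries per day before giving up.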
Congrats! You have successfully set up speech recognition for JARVIS! The next step is to make JARVIS intelligent:
JARVIS has built-in intelligence. But to access things like weather reports, web search, stock exchange data, mathematical calculations, biographies of people, etc., you need to enable the Wolfram|Alpha API. For this, you need a Wolfram|Alpha API key, which you can get from here.
Save your Wolfram|Alpha API key in all the files named ‘WolframKey.txt’ (there will be 2 instances). Again, remember: no extra white space.
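If you would rather not hunt for both copies of WolframKey.txt by hand, the step above can be scripted. This is a minimal Python sketch under stated assumptions: `save_wolfram_key` is a hypothetical helper, not part of the project, and it assumes the files already exist somewhere under the extracted project folder.

```python
from pathlib import Path
from typing import List


def save_wolfram_key(project_root: Path, api_key: str) -> List[Path]:
    """Write the key into every WolframKey.txt found under project_root."""
    key = api_key.strip()  # no extra white space, no trailing newline
    if not key:
        raise ValueError("the API key is empty")
    updated = []
    for path in sorted(project_root.rglob("WolframKey.txt")):
        path.write_text(key, encoding="utf-8")
        updated.append(path)
    return updated
```

Calling `save_wolfram_key(Path("Project JARVIS"), "YOUR-KEY")` returns the list of files it rewrote, so you can confirm that both instances were found (the key shown here is a placeholder).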
You are all set! Open the project in Visual Studio and press F5 on the keyboard!
Using the program:
- Start the program.
- Wait for the brain to warm up.
- Make sure you have an internet connection. If you don’t, a red X mark will appear on the internet icon (bottom-right of the JARVIS window).
- Once connected to the internet, click the ‘Settings’ button (bottom-right).
- Select ‘Hybrid’ as your voice recognizer to make efficient use of both the Windows and Google speech recognition engines.
- If you don’t have an internet connection, you can select ‘Offline’ as your recognizer. This will reduce the accuracy of speech recognition.
- Use a good microphone for speech recognition. The quality of the microphone matters.
- Tap the giant blue microphone button to turn speech recognition on or off.
- Tell JARVIS anything. Here are some sample inputs to get you started:
‘Hello’
‘How many files do I have in the music folder’
‘please call 9992298912’
‘please Open Wordpad’
‘Weather report Washington DC’
‘When was Mahatma Gandhi born’
‘Remember that I have a meeting on 22nd April’ (You can later ask ‘When do I have meeting’)
‘Integrate zero to 10 x cube dx’
‘Search for Bill Gates’
- If the system does not recognize something correctly, simply type it into the input box (at the bottom of the JARVIS main window) and press Enter. Next time, the speech recognizer will recognize that input correctly!
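One way to picture how a typed correction could become recognizable next time is to append it to the grammar file in the same comma-separated, upper-case format. The sketch below is an assumption about the mechanism, not the project's code: the post does not say where learned phrases are stored, and `learn_phrase` is a hypothetical helper that uses a Dictionary.txt style file for illustration.

```python
from pathlib import Path


def learn_phrase(dictionary_file: Path, phrase: str) -> bool:
    """Add phrase to a Dictionary.txt style file; return True if it was new."""
    # Drop commas, collapse runs of white space, and convert to upper case.
    entry = " ".join(phrase.replace(",", " ").split()).upper()
    if not entry:
        raise ValueError("nothing to learn")
    text = dictionary_file.read_text(encoding="utf-8").strip()
    words = [word for word in text.split(",") if word]
    if entry in words:
        return False
    words.append(entry)
    # Join without a trailing comma or stray spaces, as the format requires.
    dictionary_file.write_text(",".join(words), encoding="utf-8")
    return True
```

For example, calling it with the phrase `open wordpad` on a file containing `HAI,HELLO` leaves the file as `HAI,HELLO,OPEN WORDPAD`, and calling it again with the same phrase changes nothing.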
Woah! Can't wait to try this tomorrow! Do I have to click the mic every time.. Can I make it respond to a key press, like the space bar, as a button?
Email me here:
owenparsonsop@gmail.com
thanks PAL
can you make a better version. this is slightly buggy
thank you so much for explaining this, for newbies like me.
what can I do to change the recognition language ??
The link of the source code is dead
Click Here linked me to 4shared and an empty folder.
download link is not working, can you update it?
the download link does not work buddy, could you please send it to me at ken.engenharia@yahoo.fr
https://github.com/farizrahman4u/jarvis
FARIZ RAHMAN my pc hangs badly bro... help me
how can I change jarvis' name?
How to edit wolfram app? ... help me
AI Personal Assistant technology is a new and popular technology that is buzzing across the world. This technology is adopted in mostly every sector and makes the life more comfortable and easy for the people.
the link for the source code says "The file link that you requested is not valid". please help.
manh is the google speech api free
it can't cancel the meeting on april 22nd
very nice..... congratz... am too trying to do the same thing..
How can I install more commands in Jarvis
dude its simply awesome....thanks
how can we change the voice of the assistant, if i want to use a different text to speech engine.
AI innovation is another and well known innovation that is humming over the world. This innovation is embraced in generally every part and makes the life more agreeable and simple for the general population.
Nice One, I have this and successfully doing productive tasks instead of me opening the applications for certain projects. thank you. ^_^
can you link a decompiled version of all the core dll files and the brain dll file