Now, you want to combine speech recognition and synthesis to create a simple language tutor app. This app will process recorded speech, check whether the grammar is correct, and provide feedback using synthesized speech.
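The snippets below assume an OpenAI client object named `client` is already configured, as in the earlier demos. If you're starting fresh, a minimal setup sketch (assuming the `openai` Python package is installed and your API key is in the `OPENAI_API_KEY` environment variable) looks like this:

```python
# Minimal client setup sketch: assumes the `openai` package is installed
# and the OPENAI_API_KEY environment variable holds your API key.
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment by default.
client = OpenAI()
```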
Define a function to transcribe the recorded speech using the Whisper model:
# Define a function to transcribe the recorded speech
def transcript_speech(speech_filename="my_speech.m4a"):
    # Open the audio file and transcribe using the Whisper model
    with open(speech_filename, "rb") as audio_file:
        transcription = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            response_format="json",
            language="en"
        )
    # Return the transcribed text
    return transcription.text
Next, define a function to check the grammar of the transcribed text using OpenAI’s GPT model:
# Check the grammar of the transcribed text
def check_grammar(english_text):
    # Use GPT to check and correct the grammar of the input text
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are an English grammar expert."},
            {"role": "user", "content": f"Fix the grammar: {english_text}"}
        ]
    )
    # Extract and return the corrected grammar message
    message = response.choices[0].message.content
    return message
After that, define a function to generate spoken feedback using the text-to-speech capability:
# Provide spoken feedback using TTS
def tell_feedback(grammar_feedback, speech_file_path="feedback_speech.mp3"):
    # Generate speech from the grammar feedback using TTS
    response = client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=grammar_feedback
    )
    # Save the synthesized speech to the specified path
    response.stream_to_file(speech_file_path)
    # Play the synthesized speech
    play_speech(speech_file_path)
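The `play_speech` helper comes from the earlier demo in this lesson. If you don't have it handy, here's one possible sketch that shells out to a platform audio player. The player names (`afplay`, `ffplay`) are assumptions about what's installed, not part of the course code:

```python
import subprocess
import sys

def play_speech(speech_file_path):
    """Play an audio file with a platform-specific player (sketch)."""
    if sys.platform == "darwin":
        # macOS ships with the afplay command-line player
        subprocess.run(["afplay", speech_file_path], check=True)
    elif sys.platform.startswith("linux"):
        # ffplay (from ffmpeg) handles MP3; assumes ffmpeg is installed
        subprocess.run(
            ["ffplay", "-nodisp", "-autoexit", speech_file_path], check=True
        )
    elif sys.platform == "win32":
        import os
        os.startfile(speech_file_path)  # Opens with the default audio app
```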
Finally, put everything together in a function that handles the entire process from recording audio to providing spoken feedback:
# Implement the grammar feedback application
def grammar_feedback_app(speech_filename):
    # Transcribe the recorded speech
    transcription = transcript_speech(speech_filename)
    print(transcription)
    # Check and correct the grammar of the transcription
    feedback = check_grammar(transcription)
    print(feedback)
    # Provide spoken feedback using TTS
    tell_feedback(feedback)
In this function, you:
Transcribe the Recorded Speech: The transcript_speech function is called with speech_filename to transcribe the speech from the audio file.
Check and Correct the Grammar: The transcribed text is passed to the check_grammar function to check and correct its grammar.
Provide Spoken Feedback: The corrected text is then passed to the tell_feedback function to create and play a spoken version of the feedback using text-to-speech.
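To see the data flow of those three steps without making API calls, you can dry-run the same pipeline with stub functions standing in for the OpenAI-backed helpers. All names here are illustrative, not part of the app:

```python
# Stand-ins for transcript_speech and check_grammar, so the pipeline
# shape can be exercised without an API key or an audio file.
def fake_transcribe(speech_filename):
    return "She go to school every day."

def fake_check_grammar(text):
    # A real call would send the text to GPT; here we hard-code the fix.
    return text.replace("She go ", "She goes ")

def dry_run_app(speech_filename):
    transcription = fake_transcribe(speech_filename)  # step 1: transcribe
    feedback = fake_check_grammar(transcription)      # step 2: correct grammar
    return transcription, feedback                    # step 3 would speak it

transcription, feedback = dry_run_app("my_speech.m4a")
print(feedback)  # She goes to school every day.
```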
To test the grammar feedback app, you have to give the grammatically incorrect audio file to the app. To create an audio file for speech input, you can use the Voice Recorder app on Windows, VoiceMemo on macOS, or a general recording app on Linux. You can refer back to the instructions section on how to do that if you need help.
Once recorded, place the audio file in the audio folder and update the wrong_grammar_audio variable accordingly. Alternatively, you can use a provided audio sample containing a grammatically incorrect sentence ("My father don't like to eat at night") for testing purposes.
# Set the audio file. Use the audio sample or record the
# audio yourself and place the file here.
wrong_grammar_audio = "audio/grammar-wrong.mp3"
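If the path doesn't match where you saved your recording, the app will fail when opening the file, so a quick existence check can save a confusing stack trace. A small sketch:

```python
from pathlib import Path

wrong_grammar_audio = "audio/grammar-wrong.mp3"

# Warn early if the file isn't where we expect it.
if not Path(wrong_grammar_audio).exists():
    print(f"Audio file not found: {wrong_grammar_audio}; record or download it first.")
```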
You can play it first to confirm that this audio file has a grammatically incorrect sentence.
# Play the grammatically wrong audio file
play_speech(wrong_grammar_audio)
Run the application and get the grammar feedback:
# Run the grammar feedback application
grammar_feedback_app(wrong_grammar_audio)
You’ve now seen how to use Whisper for speech recognition and synthesis in an app. Move on to the next section for this lesson’s conclusion.
This content was released on Nov 14 2024. The official support period is six months from this date.
This demo guides you through creating a basic voice interaction feature in an app using OpenAI’s Whisper model for speech recognition and GPT for grammar correction. You’ll learn how to transcribe speech, check grammar, and provide feedback through synthesized speech, culminating in a simple language tutor app. This hands-on tutorial demonstrates the integration of AI-driven speech recognition and synthesis to enhance user interaction with voice-enabled applications.