ML Kit was introduced to bridge the worlds of Android and machine learning. In the previous chapter, you worked with on-device ML using ML Kit – you built your custom Document Scanner, recognized text within images, and shared the results effortlessly with a few lines of code! It feels powerful, right? ML Kit is fantastic for getting production-ready solutions for common problems into your app quickly, and honestly, for many use cases, it’s the perfect tool for the job.
But sometimes, you need more than what ML Kit currently offers.
Maybe you want to build a cool LLM-based chat that works offline. Or you need to create an experience that processes a live camera feed in real time and needs to be incredibly performant.
That’s the moment you graduate from ML Kit to MediaPipe.
MediaPipe: A Complete Toolset for Custom Machine Learning Solutions
ML Kit gives you a set of specialized tools, whereas MediaPipe gives you the entire workshop! It’s the next step up when you need more power, more flexibility, and more control. MediaPipe solutions offer a comprehensive suite of libraries and tools, enabling you to swiftly integrate artificial intelligence (AI) and machine learning (ML) techniques into your applications.
MediaPipe provides two main resources to empower your intelligent apps:
MediaPipe Tasks: Cross-platform APIs and libraries that make it easy to deploy and integrate ML solutions into your applications.
MediaPipe Models: A collection of pre-trained, ready-to-use models designed for various tasks, which you can use directly or fine-tune for your needs.
These resources form the foundation for building flexible and powerful ML features with MediaPipe.
The tools below enable you to use these Tasks and Models for your custom ML solutions:
MediaPipe Model Maker: This is your entry point into the world of custom models. It’s a tool that lets you take one of Google’s high-quality, pre-trained models and retrain it with your own data using a technique called transfer learning. You don’t need to be an ML expert; you just need a good dataset.
The output of Model Maker is a TensorFlow Lite .tflite file, which you’ll need to convert into a MediaPipe-specific .task file. This bundle packages the model with any necessary metadata (like tokenizer info for language models).
You’ll integrate this custom .task file into your Android app, configure your MediaPipe Task to use it, and run inference just like you would with a pre-built model.
Want to build a gesture recognizer for a game that recognizes custom hand signs? Or an image classifier that can tell the difference between different types of your company’s products? Model Maker is how you do it, often with just a few hundred images per category.
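To make that concrete, here’s what loading a custom bundle into a MediaPipe Task looks like on Android. This is a minimal sketch based on the MediaPipe Tasks vision API; the asset name custom_gestures.task is a hypothetical stand-in for your own Model Maker output:

import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.vision.core.RunningMode
import com.google.mediapipe.tasks.vision.gesturerecognizer.GestureRecognizer

// Point the task at your retrained bundle instead of Google's stock model.
val options = GestureRecognizer.GestureRecognizerOptions.builder()
  .setBaseOptions(
    BaseOptions.builder()
      .setModelAssetPath("custom_gestures.task") // hypothetical asset name
      .build()
  )
  .setRunningMode(RunningMode.IMAGE) // classify single images
  .build()
val recognizer = GestureRecognizer.createFromOptions(context, options)

Swapping models is just a matter of changing the asset path; the task’s API surface stays the same.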
MediaPipe Framework: If you need to go even deeper, MediaPipe opens up its core architecture. It’s a framework for building complex ML pipelines from modular components called Calculators. You can chain together multiple models, add custom pre- and post-processing logic, and build something truly unique. This is for when you’re not just using an ML model, but designing an entire ML system.
Let’s break down why you’d switch to MediaPipe instead of using ML Kit.
When “Good Enough” Isn’t Custom Enough
ML Kit is excellent for common tasks because it uses models trained on general data. But what if your app needs something more specific, more granular? This is MediaPipe’s killer feature: customization.
ML Kit lets you use a custom TensorFlow Lite model, but MediaPipe is designed from the ground up to make training, customizing, and deploying these models a core part of the workflow.
When Every Millisecond Counts: Real-Time Performance
ML Kit’s on-device models are optimized for mobile, but MediaPipe is in a league of its own when it comes to processing live and streaming media. Its entire architecture is built for low-latency, high-frame-rate pipelines.
MediaPipe achieves this through end-to-end hardware acceleration, making efficient use of the device’s GPU to handle the heavy lifting of both ML inference and video processing. When you’re processing a continuous video stream, this level of performance is the difference between a choppy, delayed experience and a fluid, interactive one.
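To see what that looks like in code, here’s a sketch of a vision task configured for a live camera feed with GPU acceleration, based on the MediaPipe Tasks API. The model file name is a placeholder, and context is any Android Context:

import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.core.Delegate
import com.google.mediapipe.tasks.vision.core.RunningMode
import com.google.mediapipe.tasks.vision.imageclassifier.ImageClassifier

val options = ImageClassifier.ImageClassifierOptions.builder()
  .setBaseOptions(
    BaseOptions.builder()
      .setModelAssetPath("classifier.tflite") // placeholder model file
      .setDelegate(Delegate.GPU)              // offload inference to the GPU
      .build()
  )
  .setRunningMode(RunningMode.LIVE_STREAM)    // per-frame results via a listener
  .setResultListener { result, inputImage ->
    // Called on every processed camera frame with the latest results.
  }
  .build()
val classifier = ImageClassifier.createFromOptions(context, options)
// For each camera frame: classifier.classifyAsync(mpImage, frameTimestampMs)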
When Your App Lives Beyond Android
This is a big reason. ML Kit is fantastic for native mobile development on Android and iOS. But what happens when your team wants to launch a web version of your app?
MediaPipe is a cross-platform framework. You can build your ML pipeline once and deploy it everywhere: Android, iOS, web, desktop, and even IoT devices. The APIs are designed to be consistent across platforms, meaning you can reuse a lot of your logic and don’t have to start from scratch for each new platform you support.
This write once, deploy anywhere capability is a massive advantage for teams that need to maintain a consistent user experience across different ecosystems.
When You Want to Live on the Cutting Edge
As MediaPipe is a more flexible and open framework, it’s often the place where you’ll first see support for more advanced and experimental on-device tasks, especially in the realm of generative AI.
While ML Kit is now rolling out its on-device GenAI APIs powered by Gemini Nano, MediaPipe often provides a more direct and customizable path for developers who want to experiment with a wider variety of open models and build more complex generative features.
Building Your First On-device LLM App
Remember the “Cat Breeds” app you built in chapter 2? What if you could chat with a veterinary specialist and ask about cats? Cool, right? Let’s build that with an on-device LLM using MediaPipe!
Adding the LLM Inference API
The LLM Inference API enables Android apps to run large language models (LLMs) entirely on-device. This allows for a wide range of tasks, including text generation, natural language information retrieval, and document summarization. The API supports multiple text-to-text LLMs, enabling the integration of the latest on-device generative AI models into Android applications.
The LLM Inference API offers several key features (you’ll find a minimal usage sketch right after this list):
Text-to-text generation: Produce coherent text responses from a given input prompt, enabling features like chat, summarization, and question answering.
Model flexibility: Choose from multiple LLMs to best suit your application’s needs. You can also fine-tune models with your own data or apply custom weights for specialized tasks.
LoRA integration: Enhance and personalize LLM performance using LoRA adapters, either by training on your own datasets or leveraging open-source LoRA models. (Note: LoRA is not supported for models converted via the AI Edge Torch Generative API.)
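Before building the full app, it helps to see the smallest possible use of this API. Here’s a minimal, blocking sketch based on the tasks-genai library; the model path and token budget are placeholders, not the chapter’s exact configuration:

import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Configure the engine with a model file already present on the device.
val options = LlmInference.LlmInferenceOptions.builder()
  .setModelPath("/data/local/tmp/llm/model.task") // placeholder path
  .setMaxTokens(1024) // combined budget for prompt + response
  .build()
val llmInference = LlmInference.createFromOptions(context, options)

// Synchronous call: returns the complete response in one go.
val response = llmInference.generateResponse("Why do cats purr?")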
Build and run the app, and check the LogCat console in the IDE. You’ll notice a “Model Not Found” exception.
Model Not Found
This occurs because the app is trying to initialize a model that isn’t available yet. You need to ensure the model is downloaded and stored at the correct path so it can be initialized properly.
Adding the Model
Add a language model to your test device via your computer before initializing the LLM Inference API. Run the following command in your terminal to check which devices are connected to your machine:
$ adb devices
You will see a list of all connected and recognized devices similar to the following output:
List of devices attached
ZRF198804FEBBD device
emulator-5554 device
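To copy the model over, you’ll run two commands against the device you want to target. This is a sketch, assuming the /data/local/tmp/llm/ folder that Google’s LLM Inference documentation uses; whatever folder you pick must match the path the app expects (LLM_MODEL.path in the starter project):

$ adb -s <device_id> shell mkdir -p /data/local/tmp/llm/
$ adb -s <device_id> push <model_download_path> /data/local/tmp/llm/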
The first line creates a folder on the connected device, associated with the device_id, to store the model.
The second line pushes the model from <model_download_path>, located on your computer, to the given folder on your test device.
Note: The <model_download_path> is variable. If you’re on Mac OS, it may look like: /Users//Downloads/TinyLlama-1.1B-Chat-v1.0_multi-prefill-seq_q8_ekv1280.task
Once these commands are successfully executed, you’ll see a confirmation in the terminal similar to the following:
Go to the InferenceManager class in the com.kodeco.android.aam.llm package in the starter project. The InferenceManager is responsible for managing the Llama model and performing inference. It handles loading the model, creating an inference session, and generating responses based on user prompts. To do so, InferenceManager relies on two key objects:
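Here’s a sketch of those two objects and the helpers that create them, using the types from the tasks-genai library. The option values are illustrative; the starter project may configure them differently:

// The engine: loads the model file and owns the heavyweight resources.
private lateinit var llmInference: LlmInference

// The session: holds one conversation's context window on top of the engine.
private lateinit var llmInferenceSession: LlmInferenceSession

private fun createEngine(context: Context) {
  val options = LlmInference.LlmInferenceOptions.builder()
    .setModelPath(LLM_MODEL.path)  // model file pushed via adb earlier
    .setMaxTokens(MAX_TOKENS)      // total budget for prompt + response
    .build()
  llmInference = LlmInference.createFromOptions(context, options)
}

private fun createSession() {
  val sessionOptions = LlmInferenceSession.LlmInferenceSessionOptions.builder()
    .setTemperature(0.8f) // illustrative sampling settings
    .setTopK(40)
    .build()
  llmInferenceSession =
    LlmInferenceSession.createFromOptions(llmInference, sessionOptions)
}

Both are created once in the init block below.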
init {
  // Fail fast if the model file hasn't been pushed to the device yet.
  if (!modelExists(context)) {
    throw IllegalArgumentException("Model not found at path: ${LLM_MODEL.path}")
  }
  // Spin up the LLM engine, then open a chat session on top of it.
  createEngine(context)
  createSession()
}
Do not forget to add necessary imports for all these changes.
Build and run the app. You should be able to compile without any error and see the Chat screen, but that’s not functional yet!
Chat Screen
Streaming Responses
To make the Chat screen functional, you need to pass the user’s input prompt to the llmInferenceSession. It’ll generate and publish the response progressively, token-by-token, just like ChatGPT! To achieve this, you’ll need to attach a ProgressListener to it.
Add a generateResponseAsync() function to the InferenceManager class, as follows:
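A sketch of that function, consistent with the session API (ProgressListener comes from the tasks-genai library, ListenableFuture from Guava):

fun generateResponseAsync(
  prompt: String,
  progressListener: ProgressListener<String>
): ListenableFuture<String> {
  // Append the user's prompt to the session's running context window.
  llmInferenceSession.addQueryChunk(prompt)
  // Start asynchronous generation; partial results stream to the listener.
  return llmInferenceSession.generateResponseAsync(progressListener)
}

The code that calls it does the following: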
Uses inferenceManager to generate a response asynchronously, passing the prompt from the user.
Updates the UI once a partial result becomes available. The UI gradually renders words until the process is done.
If there’s any error, it’ll surface the error to the UI layer (see the sketch after this list).
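Here’s a sketch of such a call site in the ViewModel; appendToLastMessage() and markResponseComplete() are hypothetical helpers, not names from the starter project:

fun sendMessage(prompt: String) {
  val listener = ProgressListener<String> { partialResult, done ->
    // Stream each new chunk of text into the message being displayed.
    appendToLastMessage(partialResult)
    // Once generation finishes, unlock the input field again.
    if (done) markResponseComplete()
  }
  inferenceManager.generateResponseAsync(prompt, listener)
}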
Include any required imports for the above changes. You’re now ready to test the LLM - build and run, and then ask about cats! Try limiting the word count to keep the context window small with queries like this:
“Tell me about different types of cats in 100 words”
Generating Response
Estimating Remaining Tokens
You must have noticed the “0 tokens remaining” message on the chat screen right after sending your first query to the LLM. Even though your prompt capped the output at 100 words, the token count in the UI doesn’t yet reflect that.
That happens because you haven’t implemented token estimation for the current chat session. That’s what you’ll do in this section.
fun estimateTokensRemaining(contextWindow: String): Int {
  // No conversation yet: signal "unknown" with -1.
  if (contextWindow.isEmpty()) return -1
  // Ask the session how many tokens the transcript occupies so far.
  val sizeOfAllMessages = llmInferenceSession.sizeInTokens(contextWindow)
  val remainingTokens = MAX_TOKENS - sizeOfAllMessages
  // Never report a negative number of remaining tokens.
  return max(0, remainingTokens)
}
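Here, MAX_TOKENS is the same limit the engine was configured with, and sizeInTokens() asks the session to tokenize the transcript for an accurate count. A hypothetical call site (chatTranscript is illustrative):

val tokensLeft = inferenceManager.estimateTokensRemaining(chatTranscript)
// Drive the "N tokens remaining" label in the UI from tokensLeft.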
Now, build and run the app. Enter the same prompt in the chat: “Tell me about different types of cats in 100 words.” You’ll notice that the token count updates dynamically with each interaction.
Estimating Tokens
Resetting the Session
Did you try tapping the Reset button? If so, you may have noticed it’s not functional yet. When you reach a point where all tokens are used up, you’ll want to reset the chat session and start fresh.
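With the session API, the usual approach is to close the exhausted session and open a fresh one against the same engine. A minimal sketch, assuming the createSession() helper from earlier:

fun resetSession() {
  // Drop the old context window entirely...
  llmInferenceSession.close()
  // ...and start a brand-new conversation on the same engine.
  createSession()
}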