Each millennial parent knows there are two big milestones in a kid’s life: At the age of 5, their brain reaches 90% of its growth. At the age of 11, they receive their Hogwarts Letter!
Children’s brains develop connections faster in the first 5 years than at any other time in their lives. This means the time we spend with our kids before the age of 5 matters most: it shapes their character and their social and emotional growth.
In the Muggle [non-wizarding] world, most kids join the regular cookie-cutter schooling system by the age of 5 and start developing their academic skills. By then, however, their linguistic, social and emotional capabilities will already have been heavily influenced by the content they consumed.
Quality early childhood experiences have been correlated with several positive life outcomes, such as full-time employment, home and car ownership, having a savings account, and positive relationships with family members!
As a parent, you might be feeling the added pressure of deciding which activities to engage them in, which skills to focus on, and which content to expose your child to.
Over the past 18 months, the proliferation of Artificial Intelligence (AI) has been nothing short of transformative, reshaping industries and fundamentally altering the way we interact with information.
With the rise of foundational models and the declining cost of access to hardware resources, AI and Machine Learning (ML) are here to stay, as reflected in the AI Bill of Rights recently published by the White House. Academically, AI-powered virtual tutors and educational chatbots provide round-the-clock assistance, democratizing access to quality education regardless of geographical location or socioeconomic status.
The way I think about it is that the biggest win from AI is personalization. AI tailors education to each student's needs, making learning more enchanting and more accessible. And if you have watched “Harry Potter and the Prisoner of Azkaban”, AI is like Hermione’s Time-Turner, allowing her to be in two classes at once and to personalize her learning to her own capacity and needs.
Large Language Models (LLMs) are very large deep-learning models pre-trained on huge corpora of data to understand the relationship between words and phrases. They are built on top of the concept of deep neural networks, specifically transformer models (introduced in 2017).
Simply put, the concept of depth is what enables the processing of large amounts of data and allows a machine to extract information from larger context and process entire sequences in parallel, unlike earlier recurrent neural networks that process inputs sequentially.
The uniqueness of transformer models comes from the introduction of self-attention mechanisms to understand relationships between words and phrases in the famous paper “Attention is all you need”.
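To make the self-attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain NumPy. The weight matrices, sequence length, and dimensions are purely illustrative (not taken from any real model), but the mechanics are the same: every token scores its relationship to every other token, and those scores decide how much of each token's information flows into the output.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # how much each token attends to every other
    weights = softmax(scores, axis=-1)       # each row is a probability distribution
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d = 4, 8                            # a 4-token "sentence", 8-dim embeddings
X = rng.normal(size=(seq_len, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # one context-aware vector per token: (4, 8)
```

Notice that every token is processed at once: the whole sequence goes through as matrix multiplications, which is exactly the parallelism advantage over recurrent networks mentioned above.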
To make this a bit more relatable, our brains process heard or written words very similarly. We encode them into signals that our brain understands, process the words, phrases and context, build inferences and then generate a response in the form of a neural signal that is decoded to spoken or written formats.
LLMs also have an encoder and a decoder. The biggest innovations come from their ability to retain context over longer attention spans and from their sheer size (on the order of billions of parameters - loosely analogous to our neural connections), which allows them to be trained on enormous amounts of data (approaching the size of the internet) and thus to handle more complex tasks like generation, summarization, and many others.
With this initial understanding of how LLMs work, one can infer that the quality of the content generated from an LLM depends (just like a child’s early brain development) on the data it is trained on.
One of the most effective tools leveraged for LLMs so far is Reinforcement Learning with Human Feedback (RLHF), a powerful approach in AI training where algorithms learn and adapt based on guidance from human interactions. This approach is more widely adopted now because generative LLMs (like ChatGPT) are easier to interact with than previous models.
Let’s return to the example of humans and how we learn! When we are young, we take actions and our teachers give us feedback. We process that feedback and (hopefully) learn from it. The better our teachers (i.e., the feedback providers), the better we learn. If that makes sense, you basically understand RLHF!
RLHF ensures that these systems continuously improve their language generation abilities while being steered by human feedback to prioritize ethical considerations, safety, and quality of output. This iterative process enables LLMs to refine their understanding of language nuances, mitigate biases, and generate more accurate and contextually appropriate responses.
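At the heart of RLHF is a reward model trained on human preference judgments. The toy sketch below is a minimal, made-up illustration (the feature vectors, dimensions, and "rater" data are all invented): it fits a Bradley-Terry-style reward model from pairwise "this response was better" labels, which is the step that turns human feedback into a training signal the model can follow.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: each response is summarized as a feature vector,
# and human raters picked the better response in each pair.
d, n_pairs = 5, 200
true_w = rng.normal(size=d)                   # the "taste" we try to recover
chosen = rng.normal(size=(n_pairs, d)) + 0.5 * true_w
rejected = rng.normal(size=(n_pairs, d))

w = np.zeros(d)                               # reward model parameters
lr = 0.1
for _ in range(500):
    margin = (chosen - rejected) @ w          # r(chosen) - r(rejected)
    p = 1.0 / (1.0 + np.exp(-margin))         # Bradley-Terry preference probability
    grad = ((p - 1.0)[:, None] * (chosen - rejected)).mean(axis=0)
    w -= lr * grad                            # maximize likelihood of human picks

accuracy = ((chosen - rejected) @ w > 0).mean()
print(f"reward model agrees with raters on {accuracy:.0%} of pairs")
```

In a full RLHF pipeline this learned reward then steers the language model itself (via reinforcement learning), but the core loop above - humans compare, the reward model learns the comparisons - is where the "human feedback" actually enters.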
When we think about leveraging LLMs for children’s content generation, we believe the quality of these content engines hinges on training them with feedback from the people we already entrust with our kids: teachers, therapists, child psychologists, and so on. This is why, when Nookly built its content engine, it had two main strategies:
Our LLM basically studied the core concepts of child psychology and social emotional learning to encourage imaginative play and exploration through interactive storytelling experiences. And just like a wizard honing their skills at Hogwarts, our LLMs refine their abilities with each interaction, and our professionals are the Order of the Phoenix.
The concept of generative image/video models is not new. You probably interact with them more than you realize - whether you have seen Obama’s deepfake video or have been using the most recent TikTok filters.
The technology initially relied on an underlying deep learning architecture called Generative Adversarial Networks (GANs) where basically there is a generator model and a discriminator model.
The generator starts by producing random images and showing them to the discriminator, which compares them against its training dataset and decides whether each one is real or fake. The discriminator returns this verdict to the generator, teaching it to adjust its output until the discriminator can no longer tell real from fake.
The problem with the GAN architecture is that the generator can only be as diverse as the dataset being shown to the discriminator. In other (more nerdy) words, it follows the distribution of the original dataset. 🤓
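The generator-versus-discriminator loop can be sketched in a few lines. The toy below is purely illustrative: the "real data" is just numbers drawn from a normal distribution centered at 3, the generator is reduced to a scale-and-shift of random noise, and the discriminator is a two-parameter logistic classifier. What it shows is the alternating update structure, not a practical GAN.

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

g = np.array([1.0, 0.0])   # generator params: [scale, shift] applied to noise
d = np.array([0.0, 0.0])   # discriminator params: [weight, bias]
lr = 0.05

for step in range(2000):
    real = rng.normal(3.0, 1.0, size=32)
    fake = g[0] * rng.normal(size=32) + g[1]

    # Discriminator step: push real samples toward label 1, fakes toward 0.
    for x, label in ((real, 1.0), (fake, 0.0)):
        p = sigmoid(d[0] * x + d[1])
        d -= lr * np.array([((p - label) * x).mean(), (p - label).mean()])

    # Generator step: regenerate fakes and nudge them toward "looks real".
    z = rng.normal(size=32)
    fake = g[0] * z + g[1]
    p = sigmoid(d[0] * fake + d[1])
    dfake = (p - 1.0) * d[0]   # chain rule through the discriminator
    g -= lr * np.array([(dfake * z).mean(), dfake.mean()])

print(f"generated mean after training: {g[1]:.2f} (real data mean is 3.0)")
```

Even in this tiny setting you can see the limitation described above: the generator only learns to match the statistics the discriminator was shown, so it can never be more diverse than the training data it is judged against.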
The recent revolution in image generation arose from the introduction of the diffusion model in 2015 by researchers from Stanford and UC Berkeley (an idea originally borrowed from statistical physics). In diffusion models, we deliberately add controlled noise to explore possibilities, then use denoising techniques to filter out unnecessary information and focus on what's important. It's a way of helping the model see through the clutter to find the most relevant data.
The idea here is by showing the model a dataset to which it can introduce noise, represent it in a latent space then denoise, it can learn the composition of images and generalize to create other images. For text-to-image generation, the idea is to link those latent space representations to encoded word meanings.
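The forward, noise-adding half of a diffusion process is simple enough to sketch directly. The snippet below assumes the standard linear beta schedule as an illustration (the "image" is just an 8×8 grid of random numbers): it shows how a clean input can be jumped to any chosen noise level in a single step, which is what the model later learns to reverse.

```python
import numpy as np

rng = np.random.default_rng(3)

# Forward (noising) process: x_t = sqrt(abar_t)*x0 + sqrt(1-abar_t)*eps,
# using the common linear schedule of per-step noise amounts.
T = 1000
betas = np.linspace(1e-4, 0.02, T)      # noise added at each step
alphas_bar = np.cumprod(1.0 - betas)    # cumulative fraction of signal kept by step t

def noise_to_step(x0, t):
    """Jump directly from a clean input x0 to its noisy version at step t."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = rng.normal(size=(8, 8))            # stand-in for a tiny "image"
early = noise_to_step(x0, 10)           # barely noised: still resembles x0
late = noise_to_step(x0, 999)           # essentially pure noise
print(np.corrcoef(x0.ravel(), early.ravel())[0, 1])  # close to 1
print(np.corrcoef(x0.ravel(), late.ravel())[0, 1])   # near 0
```

Training then consists of showing the model these noised versions and asking it to predict the noise that was added; at generation time, running that denoiser in reverse from pure noise produces a new image.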
By learning from large datasets of images, these models develop the ability to generate new and diverse visual content, ranging from lifelike portraits to imaginative landscapes. Image generation models find applications in various fields, including art, design, entertainment, and even scientific research, offering limitless possibilities for creativity and innovation.
Visual learning plays a vital role in early childhood education, offering a dynamic pathway for children to develop essential skills such as empathy, perspective-taking, and social awareness (more on that in our next blog post). Through the power of visual storytelling, children are immersed in diverse representations and scenarios that encourage them to understand and appreciate the experiences of others.
Afterall, children’s books and cartoons have been an essential part of the upbringing of many children around the world. We see Generative AI as a way to normalize the proliferation of those visual tools for all kids and to make it more accessible for parents to create those visuals on the spot just with the flick of a wand (or in our case the press of a button).
By harnessing the immersive nature of visual storytelling and introducing personalized characters for each child, parents and educators can create enriching learning environments that empower children to become empathetic and socially conscious individuals from an early age.
“Social and emotional learning (SEL) is an integral part of education and human development. SEL is the process through which all young people and adults acquire and apply the knowledge, skills, and attitudes to develop healthy identities, manage emotions and achieve personal and collective goals, feel and show empathy for others, establish and maintain supportive relationships, and make responsible and caring decisions.” (Read more here) You can read more about SEL and early childhood education in our blog posts: “What is a social story and why do they matter?” and “Guide to hard conversations with your kid.”
At Nookly, we believe personalized and gamified social emotional learning content can facilitate deeper connections and understanding between parents and children. We envision a future where a caregiver can set behavioral goals for their kids, and Nookly, drawing on its in-depth understanding of storytelling, teaches kids how to model good behavior and fosters emotional intelligence, supporting the development of self-awareness, self-regulation, and empathy through AI-supported activities.
More importantly, research has shown that representation and personalization increase learning efficacy by 11%. We believe generative AI can be leveraged to create inclusive learning environments and promote cultural understanding and acceptance. Imagine that instead of consuming CoComelon, a child can watch low-stimulation content that teaches them patience, bravery, or how to be a good sibling, where they can see themselves, their families, and their family values represented and celebrated!
Finally, we believe in empowering parents. We believe parents are the unsung heroes!
AI can provide parents with valuable tools and resources to navigate the complexities of social and emotional development in children.
While acknowledging that some screen time helps children build the technological literacy skills essential in modern society, it is imperative to limit its duration and to ensure that the content accessed is trusted by parents and contributes positively to the child's growth and development.
This is why we believe that exercising parental control over content can give parents peace of mind when providing kids with the inevitable access to screens. You can help them advance their growth and still get your well-earned downtime.
As AI becomes a parenting ally, it offers tools to support and empower parents. From offering personalized advice on child development to facilitating household management, AI can alleviate parental stress and enhance family well-being. However, this technological integration must be accompanied by rigorous safeguards to address ethical concerns and ensure equitable access for all.
The main concern that comes to mind is privacy! ML was generally built to understand patterns in data. With larger models, the need for larger amounts of data becomes evident, to ensure the model is not undertrained. This comes with the added burden of ensuring proper encryption and transparent data management - a rising conversation in the space.
With children's data, we are even more on guard. We believe parents should have granular control over the information shared with AI systems, ensuring that their privacy preferences are respected at all times.
With new foundational models and the ability to fine-tune model variations for each child, we can now make sure each child’s data enhances only the models they use [basically, their data lives in its own silo]. This further protects user privacy while deepening personalization.
Addressing algorithmic bias is another imperative. Developers must prioritize diversity and inclusivity in dataset collection and model training to mitigate the risk of biased outcomes. For decades now, researchers have studied how to eliminate algorithmic bias from machine learning algorithms.
Some of the efforts included ensuring balanced data collection to ensure equitable gender, racial, national, political, religious, socio-economic and linguistic representation in collected data. Others included using synthetic data generation to counterbalance collected data biases. However, in many situations, especially for data collected from live platforms like social media or pre-existing corpuses of data, the data comes loaded with prejudiced human biases.
The challenge becomes even more pronounced with generative models because they build on top of these inherent biases and can generate even more extreme content. This is why we believe in the importance of reinforcement learning as the first line of defense. Moreover, democratizing access to AI-powered parenting tools is crucial for fostering equity and inclusivity. This entails providing affordable options and designing user interfaces that cater to diverse socioeconomic backgrounds and technological literacy levels.
In today's world, AI is not only shaping how we learn; it's also teaching us something important: the value of human connection and empathy. Just like Lily protected Harry through the power of love, we see AI only as a tool in your parenting toolbox, one that can prepare children for a future in which even more value is placed on what makes us unique as humans: our ability to connect, empathize, and wonder.