In this Cycle Time, we will examine an innovative application of DAM to English language learning and on-demand multilingual phrase making using mobile internet devices, smart phones, VoIP and instant workspaces. In particular, we will discuss how two new types of digital assets, networked learning objects and speech acts, can enable a very efficient, low-cost and "disruptive" service with worldwide use cases. In the interest of full disclosure, I should reveal that I am involved with this start-up as its DAM strategist.
Let's start with how adults learn a second language, in this case English. Extensive research in cognitive science, learning theory and skill acquisition indicates that adults learn best in an environment with the following factors:
- Clear, succinct goal(s) for learning a new skill or language, such as "I want to help my children learn in school" or "I want to make more money."
- Participation in small groups of peers of comparable skill or language attainment, reinforcing a "level playing field" and a sense of "fair play."
- Peer-led tutors with no official or institutional powers of authority, thereby emphasizing the freedom to make mistakes without fear of punishment as well as an empathic connection among members of a learning pod or group.
- Shared group processes using the same tools and procedures.
- A transparent method for validating one's true or verifiable skill level, such as the number of questions answered correctly.
- Immediate rewards for success and progress.
- Recognition among peers of progress or attainment.
One might note that learning a new, complex technical system (a DAM, perhaps?) entails these same factors! Now, learning a new language starts with basic communication:
- Listening to others and understanding.
- Speaking and others understanding you.
Basic conversational English (or Spanish, or Farsi) entails listening to and saying about 400 or so sight words, along with about 3,000 or so audio or animation assets.
With basic communication in place, basic literacy follows with:
- Reading and understanding.
- Writing and others understanding what you have written.
Figure 1 depicts a DAM repository with four classes of digital assets, calling attention to a workflow and the creation of new language learning and phrase-making assets in an international field operation.
Class 1 assets comprise raw audio or video clips with human voice: just digital files with little or no conditioning or metadata. In some cases, these files represent digital audio recordings from a mobile headset. In other cases, these files represent the intentional development of a library, such as the 400 or so sight words of basic conversation. These Class 1 assets also remind us that DAM constitutes a living, organic system that should grow, evolve and mature with user success.
Class 2 assets consist of an edited or curated set of words and phrases that may include a regional dialect. More importantly, the DAMster or translation group will compile appropriate audio, video and animation clips, matching their tonality, cadence, rhythm and volume.
Class 3 assets consist of what we call completed speech acts and networked learning objects. Here, completed speech acts constitute prerecorded phrases suitable for on-demand play out as well as normalized files and metadata, using the SCORM metadata schema as a baseline. Networked learning objects consist of a superset of metadata, emphasizing specific learning environments, skill levels, testing and measurement criteria, etc. In all cases, Class 3 assets entail extensive collaboration within a development team.
Class 4 assets consist of finished speech-act packages, each comprising a Flash file optimized for delivery over a particular type of mobile or web network, along with unique identifiers.
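To make the four-class workflow more concrete, here is a minimal sketch of how an asset might move from Class 1 to Class 4 while accumulating metadata at each stage. The names `AssetClass`, `Asset` and `promote`, as well as the metadata keys, are hypothetical and chosen only for illustration; they do not describe the actual repository.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class AssetClass(IntEnum):
    """The four asset classes described above (names are illustrative)."""
    RAW = 1         # raw audio or video clip, little or no metadata
    CURATED = 2     # edited words and phrases, possibly dialect-specific
    SPEECH_ACT = 3  # completed speech act or networked learning object
    PACKAGE = 4     # finished speech-act package, optimized for delivery


@dataclass
class Asset:
    asset_id: str
    asset_class: AssetClass = AssetClass.RAW
    metadata: dict = field(default_factory=dict)

    def promote(self, **new_metadata) -> None:
        """Move the asset to the next class and record the metadata added at that stage."""
        if self.asset_class is AssetClass.PACKAGE:
            raise ValueError("Class 4 is the final stage")
        self.asset_class = AssetClass(self.asset_class + 1)
        self.metadata.update(new_metadata)


# Hypothetical walk through the workflow for a single clip.
clip = Asset("clip-0001")
clip.promote(dialect="en-US")            # Class 1 -> Class 2
clip.promote(skill_level="beginner")     # Class 2 -> Class 3
clip.promote(delivery_format="flash")    # Class 3 -> Class 4
```

In this sketch the metadata only ever grows, which mirrors the idea that each class adds conditioning and description on top of the one before it.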
Figure 2 depicts additional details of finished speech-act packages, calling attention to three more dimensions of a digital supply chain for language learning and on-demand phrase making with a library of prerecorded phrases and the ability to stream them at will:
- Four user classes that span governmental agencies and non-governmental agencies (NGOs).
- Mission libraries that span a range of social interactions and situations.
- Metadata that expands to include geospatial data, rights, and status of use.
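As a rough illustration of these dimensions, the sketch below models a finished speech-act package as a record carrying a mission library, rights, status and optional geospatial data, plus a simple lookup of a prerecorded phrase by library, phrase and language. Every field name, value and the `find_phrase` helper are assumptions made for this example, not the actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SpeechActPackage:
    package_id: str        # unique identifier
    mission_library: str   # hypothetical library name, e.g. "medical"
    phrase_key: str        # language-neutral label for the phrase
    language: str          # language code of the recording
    media_uri: str         # where the prerecorded file lives
    rights: str            # rights metadata
    status: str            # status of use, e.g. "approved" or "draft"
    location: Optional[Tuple[float, float]] = None  # (latitude, longitude)


def find_phrase(packages, mission_library, phrase_key, language):
    """Return the first approved package matching library, phrase and language, or None."""
    for package in packages:
        if (
            package.mission_library == mission_library
            and package.phrase_key == phrase_key
            and package.language == language
            and package.status == "approved"
        ):
            return package
    return None


# Hypothetical library with one approved and one draft recording.
library = [
    SpeechActPackage("pkg-001", "medical", "where-does-it-hurt", "es",
                     "media/pkg-001.swf", "internal", "approved"),
    SpeechActPackage("pkg-002", "medical", "where-does-it-hurt", "fa",
                     "media/pkg-002.swf", "internal", "draft"),
]
match = find_phrase(library, "medical", "where-does-it-hurt", "es")
```

Filtering on status here is one possible design choice: it keeps draft recordings out of on-demand playback until they have been approved.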
Figure 2 depicts a speech-act package that would enable just about anyone with a mobile internet-connected communications device to "say" something to another person, more or less in real time, in just about any language and have that other person understand. This scenario remains one-way phrase making that uses canned or recorded phrases, but with new speech recognition technologies, we can envision the day of a universal communicator. Yep, beam me up, Scotty!





