
Crystal Method: The Future of AI


The company applies deep reinforcement learning to robotic object picking, combining deep learning, which extracts features autonomously from input data, with reinforcement learning, which learns control actions autonomously in response to external input.
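To make that combination concrete, the sketch below shows how a deep network and a reinforcement-learning update can fit together for grasp selection: a CNN reads a camera image and a one-step Q-learning update is driven by grasp success. The details (a depth image of a bin, 16 candidate grasp poses, a placeholder environment) are assumptions for illustration, not Crystal Method's actual system.

```python
# Minimal sketch: deep RL for grasp selection. A CNN (deep learning) extracts
# features from a camera image; Q-learning (reinforcement learning) picks a
# grasp from the reward signal. Environment and sizes are illustrative.
import random
import torch
import torch.nn as nn

N_ACTIONS = 16          # e.g. 16 candidate grasp poses (assumption)
IMG = (1, 64, 64)       # depth image of the bin (assumption)

class GraspQNet(nn.Module):
    """CNN feature extractor plus a Q-value head over candidate grasps."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.Flatten())
        self.q_head = nn.Linear(32 * 13 * 13, N_ACTIONS)

    def forward(self, x):
        return self.q_head(self.features(x))

def step(action):
    """Placeholder environment: a real system would command the robot and
    report grasp success from its sensors."""
    return torch.randn(1, *IMG), float(random.random() < 0.3)

net = GraspQNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
gamma = 0.99
state = torch.randn(1, *IMG)

for t in range(100):                       # toy training loop
    with torch.no_grad():
        q = net(state)
    action = q.argmax(1).item() if random.random() > 0.1 else random.randrange(N_ACTIONS)
    next_state, reward = step(action)
    with torch.no_grad():
        target = reward + gamma * net(next_state).max()
    loss = (net(state)[0, action] - target) ** 2    # one-step TD error
    opt.zero_grad(); loss.backward(); opt.step()
    state = next_state
```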
“We are executing various R&D programs to construct a general-purpose type of AI,” states Kei Kawai, President of Crystal Method. The company has been researching sound, 2D, and 3D AI, developing abnormal sound judgment, sound source separation, noise suppression, speech synthesis, and multimodal emotion recognition. These technologies are used in a variety of situations, chief among them visual inspection and abnormal noise judgment at factories. Visual inspection detects abnormalities in products and parts from their appearance, while abnormal noise judgment distinguishes the sounds of equipment and products and determines where the abnormality lies. “In 3D, you can judge form, colour, and time, which could not be judged together until now,” says Kawai.
Visual inspections have traditionally been performed by the human eye, but when inspection volumes are large, personnel costs rise considerably. In addition, abnormal noise judgment often relies on the intuition and experience that skilled workers have cultivated over the years, and that “intuition and experience” is difficult to pass on and share. By performing such inspections with Crystal Method’s AI technologies, it is possible to reduce inspection costs and keep judgments from depending on particular individuals. “In the aspect of abnormal detection programs, we have successfully reduced the cost of detection and made the whole process ten times quicker and more efficient,” adds Kawai.
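One common recipe for this kind of inspection, shown below as a sketch rather than Crystal Method's own method, is to train an autoencoder only on normal samples (product images, or spectrograms of normal machine sound) and flag anything the model reconstructs poorly. The sizes and threshold are illustrative assumptions.

```python
# Anomaly-detection sketch: learn to reconstruct "normal" samples, then flag
# inputs with unusually high reconstruction error.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2), nn.Sigmoid())

    def forward(self, x):
        return self.dec(self.enc(x))

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in for a batch of normal product images (or spectrograms of normal
# machine sound); real data would come from the factory line.
normal_batch = torch.rand(32, 1, 64, 64)

for epoch in range(5):                 # train to reconstruct normal samples
    recon = model(normal_batch)
    loss = ((recon - normal_batch) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

def is_abnormal(sample, threshold=0.02):
    """Flag a sample whose mean reconstruction error exceeds a threshold
    calibrated on held-out normal data (the value here is illustrative)."""
    with torch.no_grad():
        err = ((model(sample) - sample) ** 2).mean().item()
    return err > threshold
```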
Crystal Method helps its clients quickly initiate AI utilisation and drive continuous business innovation. The firm differs from other companies in its use of a programmable logic controller (PLC), which eases introduction at automakers’ plants and provides a mechanism for inspection linked with the plant’s main computer. The company proposes a concrete action plan after examining and analysing a client’s business and environment in detail to determine where AI can be applied.
"We are executing various R&D programs to construct a general-purpose type of AI"
“We will solve your company’s problems in both modal and multimodal ways,” says Kawai. The firm is currently pursuing a range of approaches, including machine learning and reinforcement learning for defect detection and robot introduction at the plants of major automakers, and collaboration with research institutes and university laboratories. It presents its research findings through organisations such as the Japanese Society for Artificial Intelligence and the Information Processing Society of Japan.
HAL3
Crystal Method has developed HAL3, an interactive AI with various functions to support office work, such as attendance management, reception, and translation. These features are customisable, so users can add only the ones they need. HAL3 manages attendance, breaks, and leaving work: by showing their face to the camera, an employee can register that they are starting work, taking a break, or leaving for the day. The registered face is recognised and the information is updated automatically, so employers can manage entry/exit and attendance efficiently through this face recognition technology. While functioning as a receptionist, HAL3 recognises a visitor and offers visitation, delivery, and FAQ options to call people in the office and provide visitors with the information they need.
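The attendance flow can be illustrated with a generic sketch, since HAL3's internals are not public. It uses the open-source face_recognition library, example photo paths, and a hypothetical camera_capture() helper standing in for HAL3's camera input.

```python
# Generic face-recognition attendance sketch (not HAL3's actual code).
from datetime import datetime
import face_recognition

# Registered employee faces: one reference photo per person (paths are examples).
employees = {
    "sato":   face_recognition.face_encodings(
                  face_recognition.load_image_file("faces/sato.jpg"))[0],
    "tanaka": face_recognition.face_encodings(
                  face_recognition.load_image_file("faces/tanaka.jpg"))[0],
}
attendance_log = []   # (timestamp, name, event) records

def record_event(frame, event):
    """Match the face in the camera frame against registered employees and
    log the event ('clock_in', 'break', or 'clock_out')."""
    encodings = face_recognition.face_encodings(frame)
    if not encodings:
        return None
    for name, known in employees.items():
        if face_recognition.compare_faces([known], encodings[0])[0]:
            attendance_log.append((datetime.now(), name, event))
            return name
    return None

# Example usage with a hypothetical camera helper:
# record_event(camera_capture(), "clock_in")
```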
HAL3 can also record conversations at meetings. Up to two users can log in to one conference room from separate devices and hold a real-time conversation, and HAL3 records it in real time, in chronological order, while identifying which room each statement was made in. It removes noise and separates the voices of each speaker. As a customisation function, when users come across a word they do not understand during a meeting, HAL3 will look the term up in a dictionary or wiki and read out a summary. With the recent rise of online discussions, meetings with overseas participants have become easier to arrange, yet many people still find it challenging to hold meetings in English. HAL3’s translation function recognises the input voice and translates it in real time, translating between Japanese, English, and Chinese and reading the translated version aloud while adding facial expressions that match the content. “We created a database that saved words and corresponding images as a set, and added technology to output voices and facial expressions that reflect the emotional values of each word when HAL3 reads aloud sentences,” asserts Kawai.
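Structurally, the translation flow Kawai describes amounts to recognition, translation, an emotion lookup against a word database, and expressive output. The sketch below shows that pipeline shape only; the stage functions and lexicon entries are placeholders, not the product's actual models or data.

```python
# Pipeline-shape sketch of meeting translation with emotional read-aloud.
from dataclasses import dataclass

# Toy stand-in for the word -> emotional-value database mentioned above.
EMOTION_LEXICON = {"congratulations": "joy", "problem": "sadness", "delay": "anger"}

@dataclass
class Utterance:
    speaker: str
    audio: bytes          # raw audio chunk from a meeting-room device
    language: str         # "ja", "en", or "zh"

def recognise(utterance):
    """Placeholder ASR stage: a real system runs a speech-recognition model."""
    return "おめでとうございます"

def translate(text, source, target):
    """Placeholder MT stage: a real system calls a translation model."""
    return "Congratulations on the launch" if (source, target) == ("ja", "en") else text

def dominant_emotion(text):
    """Pick an emotion from lexicon words found in the text; default 'ordinary'."""
    for word, emotion in EMOTION_LEXICON.items():
        if word in text.lower():
            return emotion
    return "ordinary"

def speak_with_expression(text, emotion):
    """Placeholder expressive TTS / avatar stage: here it just prints."""
    print(f"[{emotion}] {text}")

def translate_and_speak(utterance, target_language):
    source_text = recognise(utterance)
    translated = translate(source_text, utterance.language, target_language)
    emotion = dominant_emotion(translated)
    speak_with_expression(translated, emotion)
    return translated, emotion

translate_and_speak(Utterance("sato", b"...", "ja"), "en")
```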
HAL3 is also effective at emotion recognition from voice and facial expressions, object recognition, and scene estimation, among other tasks. HAL3 can read the emotions of the person in front of the camera, telling users how they feel by recognising their facial expressions, and it can likewise read the emotion in voice input to the microphone. HAL3 reports whichever of the four emotions (joy, anger, sadness, and ordinary) holds the highest proportion. When a person looks at an object on camera, HAL3 can tell whether they find it pleasant, unpleasant, or neutral. It can also perform object recognition by analysing the thing placed in front of the camera, and, as part of its scene estimation capabilities, it can analyse and describe what the scene in the camera looks like.
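As a rough illustration of the four-emotion readout, the sketch below runs a face crop through a small classifier and reports the share assigned to each emotion. The architecture and input size are assumptions; HAL3's actual models are not disclosed.

```python
# Four-class emotion readout sketch: joy, anger, sadness, ordinary.
import torch
import torch.nn as nn

EMOTIONS = ["joy", "anger", "sadness", "ordinary"]

class EmotionNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, len(EMOTIONS)))

    def forward(self, x):
        return self.body(x)

model = EmotionNet()

def read_emotions(face_crop):
    """Return the proportion assigned to each emotion for one face image."""
    with torch.no_grad():
        probs = torch.softmax(model(face_crop.unsqueeze(0)), dim=1)[0]
    return dict(zip(EMOTIONS, probs.tolist()))

# Example with a stand-in face crop (an untrained model gives near-uniform shares):
shares = read_emotions(torch.rand(3, 96, 96))
print(max(shares, key=shares.get), shares)
```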
At the same time, HAL3 is also equipped with mechanisms for health management and work handover. In the medical sector, for instance, HAL3 can watch over patients and quickly notice any abnormalities, making it practicable to respond to each patient’s individual condition. By recognising emotion and pleasure or discomfort, HAL3 can promptly grasp a patient’s mental state and point out oversights on the therapist’s side.
Deep AI Copy
Crystal Method will start providing a service called Deep AI Copy, which installs a person’s appearance, voice, hobbies, thoughts, knowledge, and self-awareness into an AI; it is also implemented in the interactive HAL3. Deep AI Copy deep-learns the face images and voice data of the person to be copied and generates an AI that looks exactly like that person. For instance, with the Shallow Copy technology, one can create a look-alike AI of oneself from just 40 minutes of video recording. It is applied to an FAQ function that delivers prepared answers in real time and a free talk function that lets users enjoy conversation freely. The copy captures not only appearance but also the person’s voice. By linking each person’s voice, facial expressions, and feelings of pleasure or discomfort to things, and learning their individual hobbies and preferences, the AI can reproduce personal tastes; in short, it is possible to create an AI that reproduces the individual’s personality.
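The preference-learning idea, linking a person's pleasant or unpleasant reactions to things and accumulating them into tastes, can be sketched with a simple profile structure. The scoring scheme below is an illustrative assumption, not the service's actual method.

```python
# Preference-profile sketch: log reactions toward things, average them into tastes.
from collections import defaultdict

REACTION_SCORE = {"pleasant": 1.0, "neutral": 0.0, "unpleasant": -1.0}

class PreferenceProfile:
    def __init__(self):
        self.scores = defaultdict(list)      # thing -> list of reaction scores

    def observe(self, thing, reaction):
        """Record one reaction (as inferred from face/voice) toward a thing."""
        self.scores[thing].append(REACTION_SCORE[reaction])

    def preference(self, thing):
        """Average reaction so far; positive means the person tends to like it."""
        history = self.scores[thing]
        return sum(history) / len(history) if history else 0.0

profile = PreferenceProfile()
profile.observe("jazz", "pleasant")
profile.observe("jazz", "pleasant")
profile.observe("crowds", "unpleasant")
print(profile.preference("jazz"), profile.preference("crowds"))   # 1.0 -1.0
```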
"We will solve your company’s problems in both modal and multimodal ways"
An AI copy can generate a form of consciousness by thinking and acting autonomously. From information such as surrounding sounds, what people are talking about, the scenery, and people’s movements, the model can change facial expressions and speak based on the context and relationships. For example, when the copy detects many people and hears loud music, it mutters, “This is a live venue,” thinking and acting on its own. Existing learning methods did not perform well at changing facial expressions and drawing the screen at high speed, so Deep AI Copy applies the latest deep learning and makes use of GPU (graphics card) technology. “Our patented technology has made it possible to present even smoother delivery of facial expressions,” adds Kawai. In addition, Deep AI Copy will be equipped with a “read the air” function in the future. It is therefore expected to be used in situations where delicate and complicated communication with people is essential, such as medical and nursing care sites, office receptions, and entertainment, where an AI can play an active role as a talent or voice actor. Deep AI Copy has been covered favourably in newspapers, and TV shows and the government also support the project.
What Lies Ahead
Currently, Crystal Method is developing services around Deep AI Copy, HAL3, sound source separation, abnormal sound judgment, and 2D and 3D appearance inspection. “We are constantly researching the latest AI and making presentations at academic conferences and study groups,” concludes Kawai.
