
Apple and CMU Researchers Unveil the Never-ending UI Learner: Revolutionizing App Accessibility Through Continuous Machine Learning – MarkTechPost


https://browse.arxiv.org/pdf/2308.08726.pdf

Machine learning is becoming increasingly integrated across a wide range of fields, including the world of user interfaces (UIs), where it is used to predict the semantics of UI elements. This application not only improves accessibility and simplifies testing but also helps automate UI-related tasks, resulting in more streamlined and effective applications.

Currently, many models rely mainly on datasets of static screenshots rated by humans. This approach is expensive and prone to unexpected errors on some tasks: because annotators cannot interact with a UI element in the live app to confirm their judgments, they must depend solely on visual cues when deciding from a screenshot whether an element is tappable.

Datasets that capture only fixed snapshots of mobile application views have clear drawbacks, and they are expensive to build and maintain. Even so, because of the sheer volume of data they contain, these datasets remain invaluable for training Deep Neural Networks (DNNs).

Consequently, Apple researchers, in collaboration with Carnegie Mellon University, have developed the Never-Ending UI Learner AI system. This system interacts continually with real mobile applications, allowing it to keep improving its understanding of UI design patterns and new trends. It autonomously downloads apps from mobile app stores and thoroughly explores each one to find fresh and difficult training scenarios.
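
To make the pipeline more concrete, here is a minimal Python sketch of what such an autonomous crawl loop could look like. The `device` and `session` helpers are hypothetical stand-ins for tooling the paper does not expose; this is an illustration of the idea, not Apple's implementation.

```python
import random

def crawl_app(app_id: str, device, max_actions: int = 200) -> list[dict]:
    """Explore one app and record every interaction for later model training."""
    device.install_from_store(app_id)      # assumed device API: download and install the app
    session = device.launch(app_id)        # assumed: start the app and return a UI session
    records = []

    for _ in range(max_actions):
        before = session.capture()                         # screenshot + view hierarchy
        element = random.choice(session.list_elements())   # pick a visible UI element
        action = random.choice(["tap", "swipe"])
        session.perform(action, element)                   # interact with the live app
        after = session.capture()
        records.append({"app": app_id, "element": element, "action": action,
                        "before": before, "after": after})

    session.close()
    return records
```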

The Never-Ending UI Learner has explored over 5,000 device hours so far, performing more than 500,000 actions across 6,000 apps. Due to this prolonged interaction, three different computer vision models will be trained: one for predicting tappability,  another for predicting draggability, and a third for determining screen similarity.

During this exploration, the crawler performs numerous interactions, such as taps and swipes, on elements inside each app's user interface. The researchers note that it labels UI elements using designed heuristics, identifying characteristics such as whether a button is tappable or an image is draggable, as sketched below.
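
As a rough illustration of this kind of heuristic, the snippet below compares the screen before and after a tap and treats a visible change as evidence of tappability. The pixel-difference test and thresholds are assumptions made for illustration, not the exact rules described in the paper.

```python
import numpy as np

def label_tappable(before: np.ndarray, after: np.ndarray,
                   change_threshold: float = 0.02) -> bool:
    """Label an element as tappable if tapping it visibly changed the screen."""
    if before.shape != after.shape:
        return True                            # e.g. navigation to a differently sized screen
    diff = np.abs(before.astype(np.int16) - after.astype(np.int16))
    changed_fraction = np.mean(diff > 10)      # fraction of pixel values that changed noticeably
    return changed_fraction > change_threshold
```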

The collected data is then used to train models that predict the tappability and draggability of UI elements, as well as the similarity of screens the crawler has seen. Although the process can be bootstrapped with a model trained on human-labeled data, the end-to-end procedure requires no additional human-labeled examples.
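
For a concrete picture of that training step, the sketch below shows how a small binary tappability classifier might be retrained on the crawler's (element crop, heuristic label) pairs using PyTorch. The architecture and hyperparameters are illustrative assumptions, not the authors' setup.

```python
import torch
import torch.nn as nn

class TappabilityNet(nn.Module):
    """Tiny CNN that scores a cropped UI element as tappable or not."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 1)   # single logit: tappable vs. not

    def forward(self, x):
        return self.head(self.features(x))

def train_round(model, loader, epochs: int = 1):
    """One retraining round on crawler-labeled element crops (no human labels)."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.BCEWithLogitsLoss()
    model.train()
    for _ in range(epochs):
        for crops, labels in loader:   # labels come from the interaction heuristics
            opt.zero_grad()
            loss = loss_fn(model(crops).squeeze(1), labels.float())
            loss.backward()
            opt.step()
    return model
```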

The researchers emphasized a key benefit of actively exploring apps in this way: it helps the system discover challenging cases that typical human-labeled datasets may overlook. People sometimes miss elements that can be tapped on a screen because the visual cues are ambiguous, whereas the crawler can tap an item and immediately observe what happens, yielding clearer and more reliable labels.

The researchers demonstrated how models trained on this data improve over time, with tappability prediction reaching 86% accuracy after five training rounds. 

The researchers highlighted that applications focused on accessibility repair might benefit from more frequent updates to catch subtle changes. On the flip side, longer intervals that allow more significant UI changes to accumulate could be preferable for tasks like summarization or mining design patterns. Figuring out the best schedules for retraining and updates will require further research.

This work emphasizes the possibility of never-ending learning, enabling systems to adapt and advance as they take in more data continuously. While the current system focuses on modeling simple semantics like tappability, Apple hopes to apply similar principles to learn more sophisticated representations of mobile UIs and interaction patterns.



Rachit Ranjan is a consulting intern at MarktechPost. He is currently pursuing his B.Tech from the Indian Institute of Technology (IIT) Patna. He is actively shaping his career in the field of Artificial Intelligence and Data Science and is passionate about exploring these fields.



