During an office hours session on 2/22/24, David Holz, the founder of Midjourney, discussed the future of the platform at length, sparking anticipation and discussion within the tech community. He covered upcoming updates, the intricacies of developing AI models, and the philosophy behind technological advancement.


V6 Updates:

  • Enhanced Character Consistency: Aiming to achieve more reliable and coherent character representations in generated images.
  • Ethical Moderation System: Introduction of a new moderation system with stricter rules to ensure responsible use of AI-generated content.
  • Aesthetic and Performance Improvements: Efforts to enhance the aesthetic appeal and performance of the V6 model, addressing challenges in improving upon an already high standard of aesthetics.
  • Stability and Usability Enhancements: Including making in-painting features stable and improving the overall user experience.
  • Accurate ‘Describe’ Feature: A more accurate ‘describe’ function is being developed to improve user interactions and the precision of image descriptions.


Anticipated V7 Features:

  • Significant Quality Leap: V7 is anticipated to offer a major advancement in image quality and model capabilities, surpassing previous versions with substantial improvements.
  • High Performance and Extensibility: Promises of a cleaner, more extensible system architecture capable of reaching much higher performance ceilings.
  • Improved Speed and Character Style Work: Expected enhancements in processing speed and sophisticated character and style rendering for greater consistency and realism.


Video Model Considerations:

  • V6 vs. V7 Video Model Debate: Ongoing discussions about whether to release a video model based on V6 or wait for the advancements in V7, reflecting strategic considerations of quality and timing.
  • Potential for High-Quality Video Generation: If developed, a video model based on V6 or V7 would uphold Midjourney’s standards for aesthetic quality, potentially setting new benchmarks for AI-generated video content.
  • Exploration into Video Capabilities: Indicates an interest in offering video generation features, ensuring these capabilities align with Midjourney’s quality standards.


Additional Updates:

  • Collaborative Ventures with Other AI Labs: Exploring potential collaborations to foster innovation and a cooperative environment in the AI research space.
  • External API Introduction: Potential offering of API access, opening new avenues for creativity and integration with other platforms.
  • 3D Capabilities: Consideration of expanding into 3D modeling and rendering, broadening the creative toolkit available to users.
  • Mobile Experience Improvements: Acknowledgment of the need to enhance the mobile user experience, making the platform more accessible.
  • Ethical and Practical Challenges: Navigating the complexities of content moderation, API access, and collaborations with startups with a thoughtful and responsible approach.


Midjourney Video Model V6 vs. V7: A Strategic Crossroads

The main topic was whether to release a video model based on V6 or wait for the more refined V7. David walked through the decision, emphasizing the inherent limitations of a V6 video model despite its promising capabilities. The dilemma is whether releasing a V6 model would pull focus from V7, which promises a much higher performance ceiling and a more extensible system. This patience reflects Midjourney’s preference for refining its offerings over rushing incomplete solutions to market.


Enhancing Speed, Character Consistency, and Aesthetics

David’s discussion went beyond the video models, touching upon the imminent improvements in speed and character consistency that users can soon expect. These enhancements are part of Midjourney’s broader goal to elevate the accuracy, coherence, and overall aesthetic appeal of its outputs. However, achieving a significant aesthetic bump in V6 presents challenges, given the already high level of quality. Midjourney is exploring innovative solutions, including a potential style tuner, to push the boundaries of what’s possible in digital creativity.

On the horizon for Midjourney are several technological enhancements designed to elevate the creative process. The introduction of a new, more accurate ‘/describe’ feature promises to enhance user experience significantly. Additionally, discussions about improving character consistency and exploring 3D capabilities hint at a future where Midjourney’s tools become even more versatile and powerful, catering to a broad spectrum of creative needs.


Embracing Collaborative Growth

Midjourney may begin collaborating with other AI labs, which would mark a shift towards a more cooperative approach in the AI research community. As David put it, “We might be collaborating with some other AI labs over the next year… exploring mutually beneficial stuff between the different AI labs rather than everybody being competitive.” This collaborative spirit points to a shared interest in leveraging AI for the greater good, with innovation that benefits all the labs involved rather than pure competition.


Simplification Amid Complexity

A recurring theme in David’s narrative was the balance between introducing advanced features and maintaining user accessibility. Midjourney’s platform is becoming increasingly sophisticated, raising concerns about usability. David expressed a desire to simplify the user experience before adding complexity, ensuring that new features like character and style references are genuinely beneficial and widely adopted, rather than overwhelming users with options.


Ethical Moderation and Future Directions

David also touched on the ethical considerations of AI-generated content, particularly how the character consistency feature will be moderated. Midjourney aims to enforce stricter rules to prevent misuse, so that its technology fosters creativity without crossing moral boundaries. The same caution applies to potential partnerships with other AI labs and to the exploration of APIs for broader integration.


Investing in the Future: Medicine and Beyond

David’s interest in the medical field underscores a broader ambition to leverage Midjourney’s success into sectors with profound impacts on humanity. He sees the next decade as pivotal for medical advancements, driven by AI and technology. “We need more money and more time. It’s definitely on the list,” David mentioned, acknowledging the intricate challenges and substantial resources required to make significant inroads into medical innovation. His aspiration to contribute to this field reflects a commitment to using Midjourney’s capabilities for societal benefit, beyond the realms of creativity and art.


Rethinking Business Models: User Experience over Revenue

In a decisive move away from conventional revenue models, David highlighted Midjourney’s stance against a pay-per-prompt system. Such a model, he argued, could heighten user anxiety, detracting from the creative exploration that Midjourney aims to foster. This perspective reveals a deeper philosophy within Midjourney: prioritizing user experience and creative freedom over immediate financial gains. It’s a testament to the company’s user-centric approach, ensuring that financial considerations do not hinder the creative process.


The Financial Dynamics of ‘Relax Mode’

Midjourney’s ‘relax mode’ was initially designed to make use of unused computational power, but its role has changed. It now lets users interact with the AI in an unrestricted, more leisurely manner, which encourages creative exploration. That comes with financial challenges for the company: ‘relax mode’ operates at a loss and consumes a significant amount of computational resources that could otherwise support the platform’s broader user base, particularly those with active subscriptions.

The crux of the issue lies in the limited nature of these computational resources. By allowing users to extensively use ‘relax mode,’ Midjourney inadvertently reduces the availability of these resources for other users who might need them within the same billing cycle. This scenario has prompted a thoughtful examination within Midjourney of how best to balance user experience with the practical limitations of resource allocation.

To ease these financial pressures, Midjourney is contemplating several strategies to keep ‘relax mode’ viable. These include potentially slowing the mode down or changing its cost structure for users, with the aim of reducing the financial burden on the company while retaining the mode’s core appeal. Such measures reflect an ongoing effort to find a sustainable way to preserve ‘relax mode’ without compromising the platform’s operational efficiency.
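The trade-off described in this section can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions: the function name, the `relax_speed` throttle, and every number are hypothetical and are not taken from Midjourney or from David’s remarks. It assumes a fixed pool of GPU hours per billing cycle and shows how slowing relax jobs would leave more of that pool for everyone else.

```python
def capacity_left_for_others(total_gpu_hours, relax_demand_hours, relax_speed=1.0):
    """Return GPU hours left for non-relax usage in one billing cycle.

    Hypothetical model: relax_speed is a throttle between 0 and 1.
    1.0 serves relax jobs at full pace; 0.5 serves half as much
    relax work within the same cycle.
    """
    if total_gpu_hours < 0 or relax_demand_hours < 0:
        raise ValueError("GPU hours must be non-negative")
    if not 0.0 <= relax_speed <= 1.0:
        raise ValueError("relax_speed must be between 0 and 1")
    # Relax jobs can never consume more than the whole pool.
    relax_used = min(relax_demand_hours * relax_speed, total_gpu_hours)
    return total_gpu_hours - relax_used


# Made-up numbers, chosen only to show the direction of the effect.
full_pace = capacity_left_for_others(1000.0, 400.0)
slowed = capacity_left_for_others(1000.0, 400.0, relax_speed=0.5)
print(full_pace, slowed)
```

In this simplified model, halving the pace of relax jobs frees up half of the capacity they would otherwise consume. A real pricing or scheduling decision would depend on factors this sketch ignores, such as peak demand and actual hardware costs.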


Looking Ahead with Optimism

Despite the uncertainties and technical challenges, David’s outlook remains optimistic. The anticipation for the V7 model is palpable, with promises of significant improvements over its predecessors. Midjourney’s journey is one of careful planning, ethical consideration, and unwavering dedication to pushing the frontiers of AI artistry.

As Midjourney continues to evolve, its community eagerly awaits the realization of these ambitious projects. With a blend of strategic patience, innovative thinking, and ethical responsibility, Midjourney is poised to redefine the landscape of AI-assisted creativity, making the future of digital artistry brighter and more accessible than ever before.