GenAI video gets even better with Midjourney

Two years ago I generated the first scene of my first novel in Midjourney as a static image, then used the Runway app to bring Roisin to life; this was the result. The post about it was here, with a later one on keeping a consistent character in the still image generation here.

It was good, and somewhat amazing, but this week Midjourney added a powerful new video generation model, and the same image gave this even more stunning result.

The new one has much more consistency and intricate detail; down in the bottom right a silver zip toggle sways as she types. The music was not AI, just added on a YouTube Short. Midjourney has the stated goal of getting to generate worlds, i.e. the metaverse. This seems to be at least as good as Google's Veo 3, which we just got here. It's much harder to spot the progress in the LLMs, but this is a good visual indication of the speed of improvement.

The flying penguins get a look in too. It has been tricky to get penguins to fly so far, but Google Veo 3 did this one, the desert island with a generated soundtrack too. It's hard to see that they are penguins, but they are 🙂

Whilst that looked really good, the ultimate one came from Midjourney video again. Look at this 🙂 (sound added with the YouTube Shorts editor).

Another metaverse explainer layer

Just before the holidays I posted an experimental video built with GenAI tools to explain the evolution of the metaverse, which may not always be obvious, as people often wait for a big bang of a product. Instead we see the ever increasing digital transition of our real-time interactions, just as we have done with maps: to GPS location, then to full GPS with traffic and route finding. That is itself a digital twin of the world that many of us interact with daily in our cars and on our phones. That video is in this post.

Following on from that is this one, which shows further things that can be done with the very same assets that made a sequential video, but now split into a presentation layout using virtual space, plus a little bit about how dynamic virtual worlds can be. The primary message is that a GenAI video in a virtual world is not that far from being an entire virtual world to explore itself. We explore 3D data all the time ourselves, in games and in mapping too. Sometimes we do need to be spoon-fed content, as in a video or a PowerPoint deck, but other times it's better to look and experience at your own pace. This virtual world concept for presentations is not a new one. Back in 2009 I wrote about trying a different presentation style laying out panels in Second Life, and I also used it for rehearsing the first of my many TV slots in 2010, this one on 3D printers. What has become easier now is the creation of the content in the first place. Well, I say easier; it's different, still with lots of trial and error, and you need a bit of a vision for what you are trying to do.

I used my spatial.io account for this, but to extend the tech experiment I also used a custom deployment through Unity to see how it all worked from that point of view too. So this is a mix of native spatial.io tools and their base world items, and some extras that I pumped in through Unity. (They have recently changed their licensing and access, so I am not posting the Spatial link as it may need some work 🙂)

Reusing metaverse content in the metaverse to describe the metaverse….

Talking with an AI of Roisin from my novels

Achievement unlocked. I just ran a local-only version of a #genAI LLM and gave it the text of my two sci-fi novels Reconfigure and Cont3xt. Amongst other things I have had a conversation with my lead character Roisin! Separately (as in the photo below) I also asked if the books helped solve whether we live in a simulation. A great muse to chat with about the potential of the third book. reconfigurebook.co.uk It's not perfect, but I can let it know what it has not got quite right, and it also helps me remember the intense process of writing them in 2015, as they flowed onto the page like binge-watching a boxed set.

I used GPT4All (https://www.nomic.ai/gpt4all) and just added one of the Llama models to it. Pointing it at a fresh directory with a copy of the book PDFs was enough to get going. This is an MBP with an M2 chip, but there was no delay in having a conversation or diving right into the text or the personality of Roisin. Words are obviously much quicker to process than generating images or video.
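For anyone wanting to try something similar in code rather than the GPT4All desktop app, here is a minimal sketch. The persona wording, model file name and helper functions are my own illustrative assumptions, not the exact setup described above; the commented-out lines show roughly where the gpt4all Python package would slot in.

```python
from pathlib import Path


def build_persona_prompt(character: str, book_text: str, max_chars: int = 4000) -> str:
    """Wrap an excerpt of the novel in a role instruction for a local LLM."""
    excerpt = book_text[:max_chars]  # keep the context small for a laptop model
    return (
        f"You are {character}, a character from the novel excerpted below. "
        "Stay in character and answer only from details in the text.\n\n"
        f"--- EXCERPT ---\n{excerpt}"
    )


def chat_with_character(system_prompt: str, question: str) -> str:
    # Hypothetical local inference via the gpt4all package (pip install gpt4all):
    # from gpt4all import GPT4All
    # model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
    # with model.chat_session(system_prompt=system_prompt):
    #     return model.generate(question, max_tokens=400)
    raise NotImplementedError("requires a locally downloaded model")


# Fall back to a stand-in line if the (hypothetical) text export isn't present.
source = Path("Reconfigure.txt")
book_text = source.read_text() if source.exists() else "Roisin tapped at the keyboard..."
system_prompt = build_persona_prompt("Roisin", book_text)
```

In practice the desktop app's LocalDocs folder does the equivalent of this prompt-stuffing for you; the sketch just makes the mechanism visible.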

AI Roisin picked up on some of her mannerisms in the books and played heavily on the various situations she has encountered. A lot of the book is about her inner voice and intentions so genAI had a lot to go on.

The wider world of the books, the tech and the philosophical elements of the story are things it was not always getting quite right. Things changed a bit when it said it was trying not to generate spoilers and I pointed out that I wrote it; the LLM changed tone and intention a little. It was the usual thing: you ask it to describe something and it tells you x and y but not z; you mention z and it's "sorry, my mistake, yes you are right". However, for a scoot through the lore, the background and some of the other characters, this is all good. People may have read the book and got a different feel for something, so it's good not to treat it too rigidly.

My favourite part was when Roisin switched to whispering some extra details about something, an out-loud statement followed by a "psst… listen" type of moment. I have only had a few tech-powered moments of that impact ever.

As with my previous renders of Roisin as an image and a video, I am looking forward to the ongoing evolution of this, so I can hang out in a metaverse version of my created world and characters.

A short video about Metaverse

I decided to create a very short metaverse evolution explainer video, but using GenAI. It is based on looking at patterns: we have all got used to moving from paper maps to digital ones, with GPS, and now fully instrumented digital twins with traffic and other information. That leap applies to many other use cases. It's all metaverse, digital to physical and back again: people, machines and data. All on my own midjourney.com, runway.ml and luma.ai accounts. I also learned a lot more about how hard it can be to wrangle AI into what you really want, but it works 🙂

Metaverse evolution

What was mad about this was that I generated my key images in Midjourney and had a couple of goes at Runway that I was happy with (the splicing together and the talking soundtrack are also Runway), but a day after I had it where I wanted it, Luma.ai went live and I gave that a go. For a few of the scenes it was just much more what I needed. There is a point where you just have to hit publish, but these things keep improving as base tools, let alone the skills also improving to ask for the right thing. It is very much a creative process, even if the mantra is often that AI is taking over.

A presentation about everything

I recently gave a BCS presentation online for the Animation and Games specialist group, and anyone else who wanted to come along, where I took some of the individual subjects that I have been engaged with in emerging technology and tried to describe them all in context with one another. It was a bit of a mad thing to try to do, but I also looked at how we might all be able to understand some of the technology advances by cutting through the jargon. It got moderately philosophical, with both fractal thinking and yin/yang concepts that my brain is pondering more due to learning Tai Chi. Quite a combination!

The presentation is now available on YouTube if you want to take a dive into “IoT, 5g, AI/GenAI, Cloud streaming, Edge computing, metaverse, Spatial computing, AR, VR, XR, quantum computing, industry 4.0/5.0, Brain computer interfaces (BCI), CRISPR, open source, crypto, Web 3.0 (the list goes on)”

The full BCS page with all the blurb and a download of the deck (minus the video element) is here

Warning: it also has adverts for my books Reconfigure and Cont3xt, which are also related to all these concepts 🙂

Why talk to an industry analyst?

As an industry analyst at a well-known company (i.e. this is not about Feeding Edge, which is now just a place to blog and hold the rights for my novels), covering all things metaverse, industrial, enterprise and consumer, I know there are a few newer companies in the space that may not know what an analyst means to them. So here is an updated and repurposed blog post I wrote a few years ago elsewhere about what an industry analyst does and why you might want to talk to us (about any of our coverage areas, not just metaverse). Here is the who, what, where and when raison d'être for industry analysts as I see it.

 

Midjourney Cyberpunk Me

Who?

I have been an industry analyst for nearly 8 years, but I have been in the emerging technology industry for well over 30 years as a software engineer and architect, and as a writer and presenter. It is not uncommon for an industry analyst to have a lot of field experience; I used to brief analysts during my corporate life at IBM too. Equally, many analysts are from a journalistic or statistics background, trained in finding and sharing facts and figures. Of course there are many personality types within the profession, though in general it is a people-based one, developing contacts and relationships across industry areas. Analysts have to take on board a lot of information but cut through the marketing hype to find patterns and facts based on their experience. Not all analysts are going to be long in the tooth like me, but it often helps. In the case of this metaverse wave, I was what might now be called an influencer in the 2006 enterprise virtual worlds wave as a metaverse evangelist, putting Wimbledon into Second Life and developing connected-things solutions in the pre-IoT days with the then-new MQTT, not to mention the early days of many current technology trends: web, e-commerce, social media, blogging and, personally, gaming. My research agenda primarily covers emerging technology, where everything old is new again.

An important thing to add, and certainly one that is true of my group, is that analysts maintain integrity and impartiality by being separated from the commercial side of the company. Though we do commercial work, we do not do pay-for-play where I work. The content of what we cover and say is not based on how much clients pay, nor is it based on how much the analyst relations department spends on a dinner. Everyone has their own motivation and connection to a subject, but impartiality, trust in treating discussions as off the record, and building a reputation are core for our analysts.

What?

Large enterprises and the smallest of start-ups, and everyone in between, share a need to understand the market they are currently in, or are looking to move into. What is the competition up to? What is resonating in a market? What is going to totally disrupt the current plan? Start-ups need attention and connections to raise funding; enterprises need ongoing growth for shareholders and investors. Analyst companies are there to help provide answers and perspective on these sorts of questions. A large corporate entity may well have a competitive analysis department, but it will be focussed on the other big companies, not the quirky start-up set to make a dent in the industry. Smaller companies are busy just trying to do what they do, and may not have the time to look up and see what is going on around them. Analysts are always across a spectrum of companies, sizes and industry types. Whether it is writing regular short-form reports on companies and their latest services, longer-form reports across an entire industry sector, running ongoing surveys across thousands of industry types for directional data, custom consulting work, webinars, presentations, offering merger and acquisition (M&A) due diligence, or just a quick call to answer a client question, analysts offer their considered opinions, backed by experience and data.

One of the other things that is important to consider, especially as a start-up, is to be in the minds of relevant analysts. We talk to people all the time, and suddenly a subject might come up, maybe on a complete tangent, which is where we pattern match and say "I saw this really interesting approach from…". Those conversations might be with VCs, with companies looking for partners, or with potential customers of the company.

Where?

The actual answer to this is anywhere. We take briefings from companies, typically for around 30 minutes to an hour, on the phone, over web video conference, or in person at trade shows. We try to ensure we actually talk to companies, not just look at the web or a press release; as I mentioned above, this is a people business. Trade shows such as the huge MWC are bread and butter to analysts, and for me Hannover Messe and Augmented World Expo are also important. For example, my experience of MWC in 2019 was 30-minute meetings scheduled with 30 minutes' walking time between them, for 3 days; I logged 10 miles a day walking just in the conference centre. That is a lot of conversations and presentations from a varied set of companies.

Social media has always been a useful place for me personally and professionally since the early days, and analysts are often to be found there now. I am always willing to hear about interesting things on Twitter/X as @epredator, on LinkedIn at https://www.linkedin.com/in/epredator/ (and now Mastodon and many other places).

When?

Anytime you need to know something across your industry, or you need the industry to know about something you are doing or about to do, that's when you can benefit from an analyst company's, or multiple analysts', perspective. A start-up may be in stealth, not ready to announce yet, and that is where analyst integrity is key: tell us what you are doing off the record, and that may well lead to some suggestions that help, or to a better description from us when you come out of stealth and we share what you are doing.

Whilst we are separated from the commercial side of things, we are aware that what we produce is the core product of the company. Companies pay to access the basic reports; long-form reports and detailed survey data are charged at a different rate, as is our time on the phone answering questions or doing custom consulting. However, telling us things is not something anyone pays to do, and in those conversations we often share our thoughts, so you can get something from a briefing rather than just broadcasting what you do. A final tip: if you are publicly known, have as much information as possible easily accessible, in a slide deck or on a website. We all take notes during conversations, but being able to look things up after the event is important. Who the company is and where it was founded, not just how cool the product is, has great importance; this is a people business.

A new martial arts journey – Tai Chi

I have been training in and teaching Choi Kwang Do as a defensive martial art and health-improving activity for many years. I was amazed to achieve my first black belt in it at the end of 2014. That journey was shared with all the family too for a long while; we all trained, taught and graded together for many years.

In 2019 I was getting ready for my 3rd Dan grading when, in an unrelated incident, I managed to get a concussion slipping over in the kitchen making Easter dinner. That made me ease up on training for a good while as I recovered. On getting back to full-time training and classes at the end of 2019, the world took its own hit as we headed into the COVID lockdown years. I continued to train at home though; my various CKD routines and a friendly BOB for impact training became a regular part of life. I wasn't so keen on Zoom call training, as all work was Zoom based and I wanted to get away from that.

A few other health issues also slowed the training down; a bout of COVID hit me with some longer-term balance and focus issues, on and off, for a year or two. As part of trying to sort this out there were a few doctors and ENT specialists, but it was all a bit off the normal diagnostic path, despite being utterly horrible and impactful at times. The result was that I thought I would try traditional Chinese acupuncture and sort my Qi (Chi) out. I started this a year ago now. In the first few months the impact was significantly positive, and it remains so. I was almost completely sorted out by September last year, but over new year I ended up with flu/COVID/whatever and it kicked me back to where I had started. One trip to acupuncture and I was well on the road to normality again. Also, during the course of the sessions, other things, such as muscle injuries from training, get rolled into the treatment and recovery is way quicker; my hay fever has also diminished to almost nothing. This got me considering what else I could do to maintain and improve this balance of energy. BTW, the place I get treatment is https://www.physicalbalance.com in Basingstoke, with Carolyn, who is always fully booked up with patients from far and wide.

At the same time as this was happening, our local CKD school was winding down, as our school owner had a very exciting family opportunity to head to Australia. A few years ago I would have jumped at the chance to take over the school and continue, but I didn't feel in a position to be able to do that. As everything happens for a reason, the acupuncture and my introduction to these energy flows got me looking around for Chinese martial arts in the area. At the same time my daughter had looked into trying something out as a Christmas present for me. She got the two of us trial lessons in Tai Chi at Shin-Gi-Tai martial arts in Basingstoke, and the odd serendipitous part of this is that it's just a few hundred metres from where I have my acupuncture, but not otherwise related in any way.

We went for our Tai Chi lessons and it was really enjoyable, in a great martial arts facility with an incredibly experienced team of instructors in all forms of martial arts. I signed straight up in January and have been going a couple of times a week since then. Tai Chi is the one with the very slow, graceful movements, and it exists in many forms alongside QiGong. QiGong is the more health-focussed art, but Tai Chi is a martial art in that, whilst you learn slow, meditative and physically beneficial movements, they are also offensive and defensive forms. I had not originally taken up CKD to learn to fight, but to learn to defend if need be, and for the mental and physical challenge. Nothing in CKD was about ego and macho competition, and walking into Shin-Gi-Tai felt very similar. Obviously Tai Chi tends to attract us older people, so we are less likely to want to show who is boss etc. There is a never-ending set of tweaks and improvements to be made and patterns to be learned, more than enough to keep anyone going. I have also found that when I do still train at home in CKD (for a bit of cardio and fast movement) I try to apply some of what I am starting to learn to feel in Tai Chi.

In case you are not convinced of the potential impact of Tai Chi as a martial art form (just as some are not convinced by other martial arts, in a tit-for-tat "mine is better than yours" way, silly I know), this video shows a demonstration that, from the point it starts at around 11:10, covers the basics of the form I have been learning.

Tai Chi Application

Our teachers, as you can see here, have an incredible history and amount of martial arts experience in so many forms. Bryan often shares how the application of some of the Tai Chi might play out, just as in the video from the US above, which for me helps contextualise it all. Though the real benefits I am enjoying are building on the energy flows from acupuncture and feeling an improvement in mental and physical well-being. It's great to have a slow, flowing set of moves to complement the also flowing and rounded ones of CKD (we have no lockouts in that art).

So there we have it: a new martial arts journey, but a complementary one. There are many more things to try at Shin-Gi-Tai too. Let's see where this goes 🙂

Consistent characters in Midjourney GenAI

It's been a while since I posted anything, for lots of reasons; that's an offline conversation. However, this weekend something appeared on my feeds that was just too exciting not to do a quick test with and then share, which I already have on Twitter, LinkedIn, Mastodon and Facebook, plus a bit of Instagram. So many channels to look at!

Back in August 2022 I dived into using Midjourney's then-new GenAI for images from text. It was a magical moment in tech for me, of which there have been few over 50+ years of being a tech geek (34 of those professionally). The constant updates to the GenAI and the potential for creating things digitally, in 2D images, movies and eventually 3D metaverse content, have been exciting and interesting, but there were a few gaps in the initial creation and control, one of which just got filled.

Midjourney released its consistent character reference approach: you point a text prompt at a reference image, in particular of a person, and then use that person as a base for what is generated by the prompt. Normally you ask it to generate an image and try to describe the person in text, or by reference to a well-known person, but accurately describing someone starts to make for very long prompts. Any gamers who have used avatar builders with millions of possible combinations will appreciate that a simple sentence is not going to get these GenAIs to produce the same person in two goes. This matters if you are trying to tell a narrative, such as, oh I don't know… a sci-fi novel like Reconfigure? I had previously written about the fun of trying to generate a version of Roisin from the book in one scene and passing that to Runway.ml, where she came alive. That was just one scene, and trying any others would not have given me the same representation of Roisin in situ.
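As a rough sketch of the prompt syntax (the image URL here is a placeholder, not my actual reference image, and `--cw` is Midjourney's optional character weight parameter, 0-100, controlling how strictly the reference is followed):

```
/imagine prompt: a girl running from a car --cref https://example.com/roisin.png --cw 100
```

A lower `--cw` keeps mainly the face, while higher values also carry across clothing and hair, which is what makes the consistent storyboarding below possible.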

Midjourney experiments
Initial render in Midjourney some months ago

The image above was the one I used to then push to runway.ml to see what would happen, and I was very surprised how well it all worked. However, on Friday I pointed the --cref tag in Midjourney at this image and asked for some very basic prompts related to the book. This is what I got:

epredator_in_a_computer_room_hacking_--cref_httpss.mj.runBEGdSJ_962db28e-f687-45b3-a6f1-dd3853750dc9
Another hacking version this time in a computer room not a forest
epredator_a_girl_running_from_a_car_--cref_httpss.mj.runBEGdSJ3_5d642f40-d948-4be8-93ae-0e5afc8c8254
A more active shot involving a car
epredator_a_girl_running_from_a_car_--cref_httpss.mj.runBEGdS_d3f394d0-6389-41e6-8050-cfe0d98a6827_1
A different style, running at the car but similar to the previous one
epredator_buying_marmite_in_the_supermarket_--cref_httpss.mj.ru_29a1e69f-65ad-47fe-aaae-794e3e23bbd5
Looking to buy some Marmite (a key attribute in the story and life of Roisin)
epredator_buying_marmite_in_the_supermarket_--cref_httpss.mj._35656c46-5f87-435a-ba91-b4d8c20eef0b_1
Another marmite shopping version
epredator_snowy_scene_with_girl_--cref_httpss.mj.runBEGdSJ3q2_5502223a-4746-4847-8218-ab1d13deea7b_2
A more illustrative style in the snow (sleeves are down)
epredator_snowy_scene_with_girl_--cref_httpss.mj.runBEGdSJ3q2_5502223a-4746-4847-8218-ab1d13deea7b_3
Another snow stylistic approach (notice sleeve up and tattoos)
epredator_in_a_computer_room_hacking_--cref_httpss.mj.runBEGd_e3489682-baf7-4265-881e-0ab6aa8a6d95_2
A more videogame/cell shaded look to the computer use in the bunker.

As you can see, it is very close to being the same person, with the same clothing, in different styles, and these were all very short prompts. With more care, attention and curation these could be made even closer to one another. Obviously a bit of uncanny valley may kick in, but as a way to get a storyboard, sources for a GenAI movie, or a graphic novel, this is a great innovation to assist my own creative process, in ways I could not otherwise manage without some huge budget from a Netflix series or a larger publisher. When I wrote the books I saw the scenes and pictures vividly in my head; these are not exactly the same, but they will be in the spirit of those.

Metaverse and GenAI webinar for BCS

This month was the AGM for the BCS Animation and Games specialist group that I have been chairing for a very long while now. I gave a presentation from a personal viewpoint (this is not a work presentation, and I make that clear in the disclaimers, though it is what I work in too of course) on the advances in Metaverse and GenAI content creation. The full YouTube version is below, and the link to the blurb and bio (and the video) at the BCS is here.

We are always looking for presenters to come and share some ideas with our specialist group around all things games, animation, metaverse, esports etc., so if you are interested, ping me; there is a slot waiting for you. We sometimes get a big crowd, other times smaller ones, but with the videos published like this it can be a useful thing to do and share.

For those of you who don't know, BCS (formerly the British Computer Society), the Chartered Institute for IT, is a UK-based (but worldwide membership) professional body for anyone in the tech industry. It exists at all levels, from those just getting going in the business to Fellows with vast amounts of experience and a willingness to help. It was part of my professional certification whilst at IBM, and I then also became a certifier whilst there. Volunteering and sharing ideas, such as this presentation, is one of the many ways to get involved (you don't have to do this). It benefits you as an individual but also elevates tech roles within the enterprises and organizations you work in.

You can find more at BCS, The Chartered Institute for IT (bcs.org)