Interesting fact! 🤓 These step-by-step tutorial scripts and the video rendering were made directly in the ProtoText app. This feature is available in version 1.10.2 and above. The speech was generated by a neural network connected to the app through an API, for $0 😉

Installation on Mac


Download the app from the official website: https://prototext.app/releases/ProtoText-Mac.zip

Unzip it anywhere on your computer, although it is better to place the app in the Applications directory.

Open the ProtoText.app.


Your computer may block the app upon first launch for security reasons. This is expected!

ProtoText has not yet been published on the App Store. If you don't trust the developer, you can check out the source code of the app or discuss it with our community on Discord.

If everything is okay, unlock the app using the Control key:

1. Hold down the Control key.

2. Open the context menu (right-click).

3. Open the app.

4. Confirm the launch.


Installation on Windows


Download the app from the official website: https://prototext.app/releases/ProtoText-Win.zip

Unzip it somewhere on your computer.

Open the "ProtoText.exe" file.


Your computer may block the app upon first launch for security reasons. This is expected!

ProtoText has not yet been published on the Microsoft Store. If you don't trust the developer, you can check out the source code of the app or discuss it with our community on Discord.

If everything is okay, just confirm the launch of the app.


Your first step in the app

You can start your work from:

1. Templates.

2. Recently edited documents.

3. Any other .ptxt document saved on your computer.

4. You can select multiple options to open each document in a new window with a single click.

This is especially convenient in full-screen mode on systems that support multiple desktops.

One screen for one document. Swipe to switch between documents.

Explore the top menu of the app

This menu contains many functions and information about shortcuts.

For example, use the menu or press CMD+N or CTRL+N to create a new window with the Welcome Screen.

Press CMD+W or CTRL+W to close the active window.

The Main Workspace. Introduction

Key points about the App

The ProtoText interface is minimalistic. All your focus stays on your thoughts, ideas, and tasks. Nothing more!

Any document in the app consists of pages.

Pages consist of cards.

You can see and work with two pages at the same time. This app feature is perfect for comparing, structuring, and linking information. It could be the secret to your productivity.

Cards can be: text, files, various links. Small pieces of information with minimal visual formatting.

The cards look like paper sticky notes. They are also compact, colored, and easy to move. Color plays the roles you define.
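To make the structure above concrete, here is a minimal sketch of the document, page, and card hierarchy. It is only an illustration: the class names, the `kind` labels, and the `color` field are assumptions, and this is not the actual .ptxt file format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Card:
    """One small piece of information: text, a file, or a link."""
    kind: str                      # "text", "file", or "link" (hypothetical labels)
    content: str                   # the text, file path, or link target
    color: Optional[str] = None    # what a color means is up to the user


@dataclass
class Page:
    title: str
    cards: List[Card] = field(default_factory=list)


@dataclass
class Document:
    pages: List[Page] = field(default_factory=list)

    def card_count(self) -> int:
        """Total number of cards across all pages."""
        return sum(len(page.cards) for page in self.pages)


doc = Document(pages=[
    Page("Ideas", [Card("text", "Buy milk", color="yellow"),
                   Card("link", "https://prototext.app")]),
    Page("Archive"),
])
print(doc.card_count())
```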

Otherwise, the workflow in ProtoText is similar to editing a typical digital text document, as in Microsoft Word, where you just type and have unlimited virtual space.

The ProtoText interface is a non-standard combination of these two practices.

It works well for finding order in chaotic data, prioritizing certain elements, and presenting the result in the form you need.

How to work with cards? Editing. Visual formatting. Working with text, files, and links.

Basic editing tricks

Everything starts with one empty card in a blank document.

Click on the card and type something.

Use the Enter key to create new cards, but keep in mind:

When the cursor is at the end of the text, a new card is created below the current one.

When the cursor is at the beginning of the text, a new card is created above the current one.

When the cursor is inside the text, the current card is split into two parts.

The Backspace key works in a similar way, but it combines and deletes cards.
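The three Enter rules above can be sketched as a small function over a list of card texts. This is a simplified model, not the app's code; the function name is hypothetical, and treating an empty card as "cursor at the end" is an assumption.

```python
from typing import List


def press_enter(cards: List[str], index: int, cursor: int) -> List[str]:
    """Return a new card list after pressing Enter in cards[index] at cursor."""
    text = cards[index]
    if cursor >= len(text):
        # Cursor at the end: a new empty card appears below.
        # Assumption: an empty card also counts as "cursor at the end".
        new = [text, ""]
    elif cursor <= 0:
        # Cursor at the beginning: a new empty card appears above.
        new = ["", text]
    else:
        # Cursor inside the text: the card is split into two parts.
        new = [text[:cursor], text[cursor:]]
    return cards[:index] + new + cards[index + 1:]
```

For example, `press_enter(["hello world"], 0, 5)` splits the card into `"hello"` and `" world"`.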

Sometimes it's handy to have quick copies of cards. Press CMD+D on macOS or CTRL+D on Windows to duplicate a card.

You can navigate between cards not only with the mouse, but also with the arrow keys on the keyboard.

If you hold down the Alt key while doing this, you can swap the cards.

Cards control elements

Every card has hidden control elements. Move the mouse cursor over a card to display these elements.

The first control element is a handle for selecting and rearranging cards.

Click the handle to select a card, double-click it to select all cards, or drag it to rearrange cards.

The second control element is a button that deletes the card.

The third control element is called "Custom Card Actions". This powerful feature lets you connect any local or remote, paid or free, digital service to the app.

Such services can generate or process text, images, audio, and video, or handle computer vision, speech recognition, and interaction with websites.

Visual formatting

Apply minimal visual formatting to highlight priorities in the data.

You can choose one of the styles in the card's context menu.

It is faster, though, to use shortcuts or the special one-key markdown method.


Cards can contain images, video, audio, and other documents. Here is how to add them.

As usual, everything starts with an empty card.

Enter the "@" symbol or use the shortcut CMD + / on macOS or CTRL + / on Windows to unlock new control elements in the card.

You can create links to files located anywhere on your computer.

You can create copies of these files and attach them to the current document.

You can make audio notes directly in ProtoText.

Please note: attaching assets and recording audio require the current document to be saved first.

ProtoText saves files under unique names in the "assets" folder next to the current document.

Duplicating a file card means duplicating only the card, not the original file!

When you attach images or videos, double-clicking on the preview will open it in full screen, and this works both ways.

Deleting cards with files moves them to the "Trash" directory next to the current document.
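The file handling described above can be pictured with a short sketch. It only illustrates the idea of an "assets" folder and a "Trash" folder next to the document. The function names are hypothetical, and the random prefix is an assumption, since the app's real naming scheme is not described here.

```python
import shutil
import uuid
from pathlib import Path


def attach_asset(document: Path, source: Path) -> Path:
    """Copy source into the "assets" folder next to the document."""
    assets = document.parent / "assets"
    assets.mkdir(exist_ok=True)
    # Assumption: a random hex prefix makes the stored name unique.
    target = assets / f"{uuid.uuid4().hex}-{source.name}"
    shutil.copy2(source, target)
    return target


def trash_asset(document: Path, asset: Path) -> Path:
    """Move an attached file into the "Trash" folder next to the document."""
    trash = document.parent / "Trash"
    trash.mkdir(exist_ok=True)
    target = trash / asset.name
    shutil.move(str(asset), str(target))
    return target
```

Because the original file is copied rather than moved, it stays where it was. Only the copy inside "assets" belongs to the document.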


This can be a link to a web resource.

For example, a YouTube video. To make this work, first paste the link address into an empty card, then convert it into an active link using the context menu or the keyboard shortcut CMD + / on macOS or CTRL + / on Windows.

The app will automatically recognize the resource and create an interactive preview if possible.

In other cases, the card will just have a button for quick access to the default browser.

If the card text does not look like a web address, the app treats it as "search tags" separated by commas. Choose this text carefully so that the tags help you later when searching large documents.
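The distinction between a web address and search tags can be sketched as follows. How the app actually decides what "looks like a web address" is not documented here, so the rule below (an http or https scheme plus a host) is an assumption, and the function name is hypothetical.

```python
from typing import List, Tuple, Union
from urllib.parse import urlparse


def interpret_link_card(text: str) -> Tuple[str, Union[str, List[str]]]:
    """Classify card text as a web address or as comma-separated search tags."""
    candidate = text.strip()
    parsed = urlparse(candidate)
    # Assumption: only http(s) addresses with a host count as web addresses.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return ("url", candidate)
    # Everything else is treated as tags; empty pieces are dropped.
    tags = [tag.strip() for tag in candidate.split(",") if tag.strip()]
    return ("tags", tags)
```

With this rule, `"budget, 2024 , travel"` becomes the tags `budget`, `2024`, and `travel`.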

Next case – Links to files. You can create links to any local files on your device, including other ProtoText documents.

Final case – Internal links. It's a very useful and powerful feature to create knowledge networks, flexible nested presentations, index pages, complex AI prompts, and more.

How to create, transcribe, and structure audio notes?

Chapter 1. Preparation of the Working Document

1. Create a blank document in ProtoText.

2. Save this document on your device.

This step is crucial for attaching future audio files to your document.

Chapter 2. Application Setup

1. After saving the document, open the application settings window.

2. In the settings, configure two essential tools: ChatGPT for text generation and Whisper for speech recognition.

3. Enable both ChatGPT and Whisper.

4. Customize the visual display of these tools according to your preference. You can use emoji icons if you like.

5. Specify the API keys required for ChatGPT. Hover the cursor over the question mark near the input field to find a link to get the API keys from OpenAI.

6. In the Chat settings, switch the prompt mode to the "duo" option.

7. Enable response splitting. In this way, each new paragraph of the ChatGPT response will be in a separate card.

8. Close the application settings by pressing the Escape key.
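The response splitting enabled in step 7 can be pictured with a tiny sketch. It assumes that paragraphs are separated by blank lines; the app's exact splitting rule may differ, and the function name is hypothetical.

```python
import re
from typing import List


def split_response(response: str) -> List[str]:
    """Split a model response into one card text per paragraph."""
    # Assumption: paragraphs are separated by one or more blank lines.
    parts = re.split(r"\n\s*\n", response.strip())
    return [part.strip() for part in parts if part.strip()]
```

Each returned string would then become the text of its own card.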

Chapter 3. Creating Audio Notes

1. Create a header for the current page by typing the hash symbol "#" inside a blank card and adding any text.

2. Press "Enter" at the end of the text to create a new card.

3. Type the at symbol "@" in the blank card to activate the mode for working with files and links.

Note that any special characters at the beginning of card text will perform their function and then disappear. This behavior is normal in Prototext.

4. You have two options for creating audio notes:

a. Attach a ready-made audio file as an asset.

b. Record new audio directly in the app.

5. Choose an option according to your preference.

If you chose the recording option and you are a Mac user, the system will ask for permission to use the microphone. Allow this, save your current document, and restart ProtoText. This only needs to be done once!

All files will be saved in the assets folder located next to the current document on your device.

Chapter 4. Speech Recognition

1. Move the mouse cursor over the card containing the audio note.

2. Click the Whisper button located in the top right corner of the card.

3. Wait briefly for the service to convert the speech in the audio note into text.

Chapter 5. Preparing ChatGPT Prompts

1. Switch the editor to split-screen mode using the shortcut [CMD+2] on macOS or [CTRL+2] on Windows, or through the menu options.

2. Create a new page for ChatGPT prompts.

3. Set the view so that on the left side there is a page with audio notes, and on the right side there is the empty page.

4. Rename the empty page and create a card for the ChatGPT prompt.

5. In this tutorial, we will use a special prompt that automatically structures our text, breaks it down into separate cards, and semantically marks them:

```Extract brief theses and questions from the given content and group them thematically using headings, title the note. The output format should be plain text lines starting with specific characters to indicate the type of statement or question. For example, use "#" for the title, "*" for a group headings, "!" for an important thesis, "-" for a negative thesis, "+" for a positive thesis, "?" for a question, and no marker for an ordinary thesis. Avoid using technical words (title, heading, thesis, question). The content is as follows:```
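To see what the prompt asks for, here is a minimal sketch that reads one response line and separates the leading marker from the text. The markers come from the prompt above. The kind labels and the function name are hypothetical, and this is not how the app itself applies card styles.

```python
from typing import Tuple

# Markers named in the prompt above; the kind labels are hypothetical.
MARKERS = {
    "#": "title",
    "*": "group heading",
    "!": "important",
    "-": "negative",
    "+": "positive",
    "?": "question",
}


def parse_marked_line(line: str) -> Tuple[str, str]:
    """Return (kind, text) for one response line; the marker is removed."""
    stripped = line.strip()
    if stripped and stripped[0] in MARKERS:
        return (MARKERS[stripped[0]], stripped[1:].strip())
    # No marker means an ordinary thesis.
    return ("ordinary", stripped)
```

For example, `parse_marked_line("# Weekly sync")` gives the kind `title` and the text `Weekly sync`, which mirrors how a special character at the start of a card does its job and then disappears.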

Chapter 6. Applying ChatGPT Prompts

1. First, click the ChatGPT button in the top right corner of the card with the note.

2. Then click the card with the prompt. This action combines the two cards into a single request to ChatGPT.

3. Wait for a response from the service, which will be automatically split into separate cards based on the response splitting enabled earlier.

4. The task is complete! The response has been received, and the splitting is done.

If you are not satisfied with the result, modify the prompt accordingly.

To better understand how it works, study the diagram that is provided in the video.

Don't forget to save changes to your documents or enable auto-save in the main menu.

How to make a simple slideshow with synthesized voiceover comments, background music and render it as an MP4 video file?


ProtoText App version 1.10.2 or higher.

Tutorial files


Chapter 1. Preparation of the working document.

Open the app and start your work with an empty template. Save the document somewhere on your computer.

Prepare three pages in the saved document:

1. The first one is for the slide show content.

2. The second one is for output videos.

3. The third is for additional files.

Chapter 2. Working with images.

Navigate to the first page (main).

Create an empty card and switch it to the linking mode by entering the @ symbol or using the context menu.

Click the "Add assets" button to attach the desired images to the current document.

You can use images of any resolution, but it's better if all of them match the dimensions of the final video. In this tutorial we are using 1920x1200 images.

Change the order of the images if necessary by dragging the cards.

Try switching to "Two page view mode" [CMD+2] or [CTRL+2]. In this way, it is possible to scroll any two pages asynchronously, and drag cards between them. This works really well for big documents.

Chapter 3. Working with audio comments.

There are three ways to create audio comments for slides:

1. The first option is to attach pre-recorded audio files.

2. The second option is to record audio comments directly in the app.

3. The third option is to generate speech from text using neural networks.

In this tutorial, we will consider the most exotic option – speech synthesis.

Let's voice this simple text as an example:

```
Image number one.
Image number two.
Image number three.
Image number four.
Thank you for watching!
```

Open the app settings. Main menu > File > Settings.

Activate the text-to-speech service, enter the API key, and choose the narrator's voice.

This text-to-speech service is provided by ElevenLabs. Create an account if you don't already have one. It offers 10K characters of narration for free each month!

Go back to the editor [ESC].

Hover the mouse cursor over a text card and click on the speech synthesis icon.

Repeat these actions for all text cards.

Remove the original text cards.

Move each generated audio card directly under the relevant image card.

Chapter 4. Additional files.

Go to the Extras page.

Create an empty card and switch it to the linking mode by entering the @ symbol or using the context menu.

This time we will work with file links instead of assets. For the slideshow, we need one background image of suitable resolution and one music track of suitable duration.

Link the two files located in the Extras folder:

1. background-image.jpg

2. Retreat - Jason Farnham - Short.wav

These links will be useful to us later.

Chapter 5. Explanation of the “Video Composing” concept.

Please open the tutorial video at 4:52 to view the diagram. It illustrates the concept of video composing in the ProtoText app.

We always start by presenting the visual part. It can be a text slide, an attached image, or a video.

In the second part, we specify how to voice it over.

Together, the visual and audio parts are called a fragment.

The final video is composed of these fragments.
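The fragment idea can be sketched as a grouping step over an ordered list of cards. This is an illustration only, not the app's renderer: the names are hypothetical, and allowing several audio cards per fragment and skipping audio that has no preceding visual are both assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Fragment:
    visual: str                                      # text slide, image, or video
    audio: List[str] = field(default_factory=list)   # voice-over cards


def group_fragments(cards: List[Tuple[str, str]]) -> List[Fragment]:
    """Group ordered (kind, name) cards into fragments."""
    fragments: List[Fragment] = []
    for kind, name in cards:
        if kind == "audio":
            if not fragments:
                # Assumption: audio with no preceding visual is skipped.
                continue
            fragments[-1].audio.append(name)
        else:
            # Any non-audio card starts a new fragment.
            fragments.append(Fragment(visual=name))
    return fragments
```

This is why the order of cards matters: an audio card belongs to the nearest visual card above it.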

Chapter 6. Rendering Settings.

Go to the main page.

Switch the app to presentation mode: Main menu > View > Presentation mode, or use the shortcut [CMD+P] on macOS or [CTRL+P] on Windows.

Note that the presentation will display the exact page and card that you were on before. The focus is preserved when switching views!

There are quite a few settings here. Let's go through them.

The "Content" section.

This section is responsible for determining which content will be used in the presentation.

All pages, or only the current one? We need only the current! Leave this parameter turned off.

Tag filtering, link exposing, and language localization are features for more complex documents.

For example, links help split up large scenarios, making content easier to manage. Exposing links in the presentation loads all the linked content.

But... This part isn't necessary for the tutorial. You can skip it.

The "Design" section.

Turn on the Slide mode. Otherwise, the content will be displayed in a continuous feed format. The video cannot be rendered as the feed.

The next two parameters, "Fit images into the frame" and "Sharp image rendering", can also be skipped because they do not currently affect video rendering (v1.10.2).

The background of slides, text color, font sizes... Customize the design to your liking.

After making changes in the text inputs, press the Enter key to update the view.

In this tutorial, we are using a background image whose link is located on the Extras page.

Copy the link to this file and use it as the background.

The "Rendering" section.

Choose the mp4 output format.

Change the video resolution according to your preference. In this tutorial we are using 1920x1200 (16:10) resolution.

Specify how long each slide should be displayed, in seconds.

Copy the link to the music track from the Extras page. If you need more tracks, enter their links separated by commas.
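The rendering settings above can be summarized in a small sketch. The class and field names are hypothetical, the slide duration and track names are made-up example values, and the app's real settings storage is not described here. The sketch only shows two things: splitting the comma-separated track field, and checking that 1920x1200 is indeed a 16:10 frame.

```python
from dataclasses import dataclass
from math import gcd
from typing import List, Tuple


@dataclass
class RenderSettings:
    output_format: str
    width: int
    height: int
    slide_seconds: float
    music_links: str          # comma-separated, as typed into the field

    def tracks(self) -> List[str]:
        """Split the comma-separated music links into a clean list."""
        return [t.strip() for t in self.music_links.split(",") if t.strip()]

    def aspect_ratio(self) -> Tuple[int, int]:
        """Reduce width:height to lowest terms."""
        d = gcd(self.width, self.height)
        return (self.width // d, self.height // d)


settings = RenderSettings("mp4", 1920, 1200, 5.0, "intro.wav, outro.wav")
print(settings.aspect_ratio())   # (8, 5), the same ratio as 16:10
print(settings.tracks())         # ['intro.wav', 'outro.wav']
```

Note that a plain comma split would break a link that itself contains a comma, so it is safer to avoid commas in file names used as music tracks.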

Alleluia 😇 We have reached the coveted Render button! Please press it right now! And let us pray! That there are no bugs... 🙏

Check the rendering result.

If rendering succeeds, the resulting video file appears in a new card at the very top of the current page.

If an error occurs, please copy the error message and send it to the app developer via Discord https://discord.gg/zze9qE5Cvq. It also helps to describe your video project or send the source files.

Press the Escape key to exit the presentation mode.

If you are not satisfied with the result, then delete the video card.

If everything is fine, move the card to the page for output videos.

Please save all changes made in the document. Main menu > File > Save or [CMD+S], [CTRL+S].


This was a long detailed tutorial about a simple slideshow – 11 cards, less than 2MB of data.

Now, for a moment, imagine that this entire video was made exactly the same way – almost 150 cards, 300 megabytes of source files, including screenshots, screen recordings, and synthesized speech.

The scenario was split into 11 parts. Each part was rendered separately, which made it simpler to maintain the quality of the end product.

This shows that the feature is already capable enough to produce long step-by-step tutorials.

ProtoText is a free app. It offers useful ideas for working with text, and now for working with video as well.

Good luck to you in video production, with cards, and "AI" by the way!


Tested input data formats: JPEG and PNG images, MP4 and MOV videos, WAV and MP3 audio.
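A quick way to check files against this list before adding them to a project is sketched below. The mapping covers only the tested formats named above; the function name is hypothetical, and accepting ".jpg" alongside ".jpeg" is an assumption based on the tutorial's own background-image.jpg file.

```python
from pathlib import Path
from typing import Optional

# Extensions for the tested formats listed above.
# Assumption: ".jpg" is accepted alongside ".jpeg".
TESTED_FORMATS = {
    ".jpeg": "image", ".jpg": "image", ".png": "image",
    ".mp4": "video", ".mov": "video",
    ".wav": "audio", ".mp3": "audio",
}


def media_kind(filename: str) -> Optional[str]:
    """Return "image", "video", or "audio", or None for untested formats."""
    return TESTED_FORMATS.get(Path(filename).suffix.lower())
```

A result of `None` does not mean the file cannot be rendered, only that its format is not in the tested list.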

Vertically oriented content can also be rendered, but the application is not well-suited for it.

Rendering video is a resource-intensive task for your computer. Expect rendering to use nearly all of your CPU, and large files to require gigabytes of RAM.

Use cases

Simple slide shows with smooth transitions and background music.

Complex step-by-step tutorials with a lot of screenshots and screen recordings.

Quickly combining two or more video files with smooth transitions.

Combine an audio podcast from multiple recordings or synthesized speech.

Create a GIF quickly from a set of pictures.

Storytelling with voiceover.

A video portfolio for your clients.