LDC #78: A Snapshot of the Clipboard

Today we’re going to talk about one of the most commonly used but least commonly understood features of your computer: the clipboard. Everyone who regularly uses a computer knows how much of a time-saver copy and paste can be, and the value only goes up the more you use it. As a programmer, the amount of time that I save by moving code around can’t even be measured, and I’m sure I’m not alone when I say that. However, the inner workings of the clipboard are often misunderstood. Today I will explain what the clipboard is and how to interact with it using Legato.

The clipboard is a global object that is used to transfer data between programs and windows. When you cut or copy data, you are placing that data into the global clipboard object. What is hidden underneath the surface, however, is that a lot more data is placed there than you might expect, and that data is stored in many different formats. It is up to the program into which you are pasting your information to decipher the data that is on the clipboard and decide which formats, if any, it can use.

Before we dive into examples of what gets placed onto the clipboard, we need to first talk about the different kinds of formats that exist. Clipboard formats fall into two categories: conventional and de facto. No matter the category, a format code starts with “cf_” followed by the name of the format. Conventional formats are the ones specifically defined by Windows in the Windows SDK, and they are referenced in all caps. These formats include CF_TEXT and CF_UNICODETEXT, and every program built for Windows should place things into conventional formats. De facto formats are defined by an application. They will have a name if given one. Otherwise they will be named “cf_” followed by the registered token in hexadecimal. Registered format tokens are reset each time the operating system restarts, which means that they are unlikely to be the same from one session, or one computer, to the next. More information, including a list of standard formats, can be found in the Legato Documentation.

Now let’s take a look at the data that gets added to the clipboard in two quick examples. These examples will use the output from our script below. First let’s examine what happens in GoFiler when you copy code from the Legato IDE:

00: Format Code: CF_TEXT         Format Name: ANSI Text            Description: ANSI Text            Size: 259
01: Format Code: cf_control      Format Name: NWS Control          Description: Novaworks Application Size: 126
02: Format Code: CF_LOCALE       Format Name: Locale (language)    Description: Locale (language)    Size: 004
03: Format Code: CF_OEMTEXT      Format Name: OEM Text             Description: OEM Text             Size: 259
04: Format Code: CF_UNICODETEXT  Format Name: Unicode Text         Description: Unicode Text         Size: 518

GoFiler adds two items to the clipboard: ANSI text and control data. Windows then adds three more items to the clipboard: OEM text, Unicode text, and Locale information. Windows does this with all textual data so that OEM, ANSI, and Unicode formats are always available together.

Next we’re going to change things up by looking at a much different example, Microsoft Word. At first you may wonder how different the clipboard could be when both examples are just text.

00: Format Code: cf_data_object  Format Name: DataObject           Description: Data Object          Size: 004
01: Format Code: cf_object_desc  Format Name: Object Descriptor    Description: Object Descriptor    Size: 140
02: Format Code: cf_rtf          Format Name: Rich Text Format     Description: Rich Text Format (RTF) Size: 41822
03: Format Code: cf_html         Format Name: HTML Format          Description: HTML Structured Data Size: 39057
04: Format Code: CF_TEXT         Format Name: ANSI Text            Description: ANSI Text            Size: 033
05: Format Code: CF_UNICODETEXT  Format Name: Unicode Text         Description: Unicode Text         Size: 066
06: Format Code: CF_ENHMETAFILE  Format Name: Enhanced Meta File (image) Description: Enhanced Meta File (image) Size: 000
07: Format Code: CF_METAFILEPICT Format Name: Meta File (image)    Description: Meta File (image)    Size: 016
08: Format Code: cf_embed_src    Format Name: Embed Source         Description: Embedded Source      Size: 13271
09: Format Code: cf_native       Format Name: Native               Description: Application Native   Size: 13270
10: Format Code: cf_owner_link   Format Name: OwnerLink            Description: Ownerlink            Size: 039
11: Format Code: cf_link_src     Format Name: Link Source          Description: Link Source          Size: 132
12: Format Code: cf_link_src_desc Format Name: Link Source Descriptor Description: Link Source Descriptor Size: 140
13: Format Code: cf_object_link  Format Name: ObjectLink           Description: Object Link          Size: 037
14: Format Code: cf_0000c355     Format Name: HyperlinkWordBkmk    Description: Not Known            Size: 040
15: Format Code: cf_ole          Format Name: Ole Private Data     Description: OLE Data             Size: 440
16: Format Code: CF_LOCALE       Format Name: Locale (language)    Description: Locale (language)    Size: 004
17: Format Code: CF_OEMTEXT      Format Name: OEM Text             Description: OEM Text             Size: 033

It turns out that when you copy text from Word you end up with 18 different formats on the clipboard! How could that be? Well, let’s dig in a little bit and see. Four are easy to explain: CF_TEXT, CF_UNICODETEXT, CF_LOCALE, and CF_OEMTEXT are the same text formats that get put on the clipboard when copying Legato code. In addition there are two images on the clipboard, a Meta File and an Enhanced Meta File. There is then the data stored as HTML and RTF. Finally, a bunch of underlying meta data comprises the rest of the formats. All of these meta data formats fall into the de facto category we discussed earlier, and presumably they are used by Word (and other Microsoft programs) when you paste elsewhere.

So now that we understand more about what the clipboard is, let’s take a look at how we can edit it. Legato gives you full control over the clipboard object, including reading from it and writing to it. In order to read off of the clipboard, we first have to get the handle to the clipboard. There are two different ways of doing this: the ClipboardCreate and the ClipboardOpen functions. Both return a handle to the clipboard. However, the ClipboardCreate function will clear the clipboard while the ClipboardOpen function will not. Whether you are looking at editing the information on the clipboard or just putting new information on the clipboard will influence which function you choose.
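
To make the difference concrete, here is a minimal sketch that uses both calls. It relies only on functions named in this post. Calling ClipboardCreate with no parameters is an assumption, so check the Legato Documentation for the exact prototype. Be aware that running it replaces whatever is currently on your clipboard.

```legato
void main() {

    handle              hBoard;

    // ClipboardOpen leaves the current contents in place, so it suits
    // scripts that only want to look at what is on the clipboard.
    hBoard = ClipboardOpen();
    MessageBox("Clipboard data came from: %s", ClipboardGetApplication(hBoard));
    CloseHandle(hBoard);

    // ClipboardCreate clears the clipboard, so it suits scripts that are
    // about to place entirely new data on it.
    // Assumption: ClipboardCreate is called without parameters here.
    hBoard = ClipboardCreate();
    ClipboardSetText(hBoard, "Placed on the clipboard by a Legato script");
    CloseHandle(hBoard);
    }
```

In both cases the handle is closed as soon as the work is done so that other programs can get at the clipboard again.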

The next four sections can be broken down into: Reading Format Data, Checking Data, Getting Data, and Setting Data.

There are a number of functions that relate to retrieving clipboard format data:

string   = ClipboardGetApplication ( [handle hClipboard] );
dword    = ClipboardGetFormatCode ( [handle hClipboard], string name );
string   = ClipboardGetFormatDescription ( [handle hClipboard], dword format | string code );
string   = ClipboardGetFormatName ( [handle hClipboard], dword format | string code );
int      = ClipboardGetFormatSize ( [handle hClipboard], dword format | string code );
string[] = ClipboardGetFormats ( [handle hClipboard] );

Using these functions we can retrieve what is essentially meta data about the data that is currently stored in the clipboard. We can use these in conjunction with our Checking Data functions (below) to get a clear picture as to what we can do with the data on the clipboard:

boolean = ClipboardIsCSVAvailable ( );
boolean = ClipboardIsDIBAvailable ( );
boolean = ClipboardIsGIFAvailable ( );
boolean = ClipboardIsHTMLAvailable ( );
boolean = ClipboardIsImageAvailable ( );
boolean = ClipboardIsJPGAvailable ( );
boolean = ClipboardIsPNGAvailable ( );
boolean = ClipboardIsRTFAvailable ( );
boolean = ClipboardIsTextAvailable ( );
boolean = ClipboardIsUnicodeAvailable ( );
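
As a rough sketch of how these two groups of functions work together, the script below checks for plain text before asking for details about the CF_TEXT format. Passing the format code as the string "CF_TEXT" follows the string code option in the prototypes above. Error handling is left out for brevity, so treat this as an illustration rather than production code.

```legato
void main() {

    handle              hBoard;
    int                 size;

    hBoard = ClipboardOpen();

    if (ClipboardIsTextAvailable()) {
      // Ask for the size of the ANSI text format by its format code.
      size = ClipboardGetFormatSize(hBoard, "CF_TEXT");
      MessageBox("Source: %s\r\nCF_TEXT size: %d",
                 ClipboardGetApplication(hBoard), size);
      }
    else {
      MessageBox("There is no plain text on the clipboard.");
      }

    CloseHandle(hBoard);
    }
```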

Once we have a clear picture of what is on the clipboard, we can retrieve any of the data from the clipboard using the Getting Data functions:

handle     = ClipboardGetData ( [handle hClipboard], dword format | string code );
string[][] = ClipboardGetCSVData ( [handle hClipboard] );
string     = ClipboardGetCSVText ( [handle hClipboard] );
handle     = ClipboardGetDIB ( [handle hClipboard] );
handle     = ClipboardGetGIF ( [handle hClipboard] );
string     = ClipboardGetHTML ( [handle hClipboard], [int mode] );
string[]   = ClipboardGetHTMLComponents ( string data );
handle     = ClipboardGetJPG ( [handle hClipboard] );
handle     = ClipboardGetPNG ( [handle hClipboard] );
string     = ClipboardGetRTF ( [handle hClipboard] );
string     = ClipboardGetText ( [handle hClipboard], [boolean utf] );
string     = ClipboardGetUnicode ( [handle hClipboard] );
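
Because one copy operation can place the same content on the clipboard in several formats, a script usually has to pick the one it prefers. The sketch below is one possible approach: it takes the HTML if there is any and falls back to plain text otherwise. The order of preference is purely illustrative, and the optional parameters of ClipboardGetHTML and ClipboardGetText are left at their defaults.

```legato
void main() {

    handle              hBoard;
    string              data;

    hBoard = ClipboardOpen();

    // Prefer the richer HTML format, then fall back to plain text.
    if (ClipboardIsHTMLAvailable()) {
      data = ClipboardGetHTML(hBoard);
      }
    else {
      if (ClipboardIsTextAvailable()) {
        data = ClipboardGetText(hBoard);
        }
      }

    // We have our copy of the data, so release the clipboard right away.
    CloseHandle(hBoard);

    if (data == "") {
      MessageBox("Nothing usable was found on the clipboard.");
      return;
      }

    // Show only the first part, since clipboard data can be very large.
    MessageBox("%s", GetStringSegment(data, 0, 500));
    }
```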

Finally we have the functions that we can use to change the data on the clipboard, our Setting Data functions:

int = ClipboardSetCSV ( handle hClipboard, string data | string [][] data | handle hPool );
int = ClipboardSetHTML ( handle hClipboard, string data | handle hPool, [boolean raw] );
int = ClipboardSetHTML ( handle hClipboard, string data | handle hPool, [string header], [string footer] );
int = ClipboardSetRTF ( handle hClipboard, string data | handle hPool );
int = ClipboardSetText ( handle hClipboard, string data | handle hPool );
int = ClipboardSetUnicode ( handle hClipboard, wstring data | handle hPool );
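
Just as Word places the same content on the clipboard in more than one format, a script can do the same so that the receiving program gets to choose. The following is a minimal sketch under a few assumptions: ClipboardCreate is called with no parameters, the optional parameters of ClipboardSetHTML are left at their defaults, and IsError is the general Legato error test, which is not otherwise covered in this post. The sample strings are made up for illustration.

```legato
void main() {

    handle              hBoard;
    int                 rc;

    // Start from an empty clipboard since we are replacing its contents.
    hBoard = ClipboardCreate();

    // Plain text for programs that only understand text.
    rc = ClipboardSetText(hBoard, "Quarterly results");
    if (IsError(rc)) {
      MessageBox("Could not place text on the clipboard.");
      CloseHandle(hBoard);
      return;
      }

    // The same content as HTML for programs that can use formatting.
    rc = ClipboardSetHTML(hBoard, "<p><b>Quarterly</b> results</p>");
    if (IsError(rc)) {
      MessageBox("Could not place HTML on the clipboard.");
      }

    CloseHandle(hBoard);

    // Tell the user that the clipboard contents were changed.
    MessageBox("The clipboard contents were replaced by this script.");
    }
```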

Let’s take a look at a couple of examples of these functions in action. The first example is one I wrote to show as many of these clipboard operations as possible. Here’s the code:

void setup() {

    // Run check_copy whenever the Paste menu function is invoked.
    MenuSetHook("EDIT_PASTE", GetScriptFilename(), "check_copy");
    }

void main() {

    setup();
    }

int check_copy(int f_id, string mode) {

    handle              hBoard;
    string              sClip;
    string              list[];
    int                 ix;
    int                 size;
    string              s1;
    handle              hLog;
    handle              hData;

    // Only act before the paste is carried out.
    if (mode != "preprocess") {
      return ERROR_NONE;
      }

    hBoard = ClipboardOpen();
    sClip = ClipboardGetApplication(hBoard);
    
    hLog = LogCreate("Clipboard Data");
    list = ClipboardGetFormats(hBoard);
    size = ArrayGetAxisDepth(list);
    ix = 0;

    s1 = "Clipboard Formats";
    AddMessage(hLog, s1);
    LogIndent(hLog);
    
    while (ix < size) {
      s1 = FormatString("%02d: Format Code: %-15s Format Name: %-20s Description: %-20s Size: %03d", 
                           ix, ArrayGetKeyName(list, ix), list[ix], 
                           ClipboardGetFormatDescription(hBoard, ArrayGetKeyName(list, ix)), 
                           ClipboardGetFormatSize(hBoard, ArrayGetKeyName(list, ix)));
      AddMessage(hLog, s1);
      ix++;
      }

    LogOutdent(hLog);

    s1 = "Clipboard Data";
    AddMessage(hLog, s1);
    LogIndent(hLog);
    
    if (ClipboardIsCSVAvailable()) {
      s1 = "CSV";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(ClipboardGetCSVText(hBoard), 0, 500));
      LogOutdent(hLog);
      }
    if (ClipboardIsHTMLAvailable()) {
      s1 = "HTML";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(ClipboardGetHTML(hBoard), 0, 500));
      LogOutdent(hLog);
      }
    
    if (ClipboardIsRTFAvailable()) {
      s1 = "RTF";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(ClipboardGetRTF(hBoard), 0, 500));
      LogOutdent(hLog);
      }
    
    if (ClipboardIsTextAvailable()) {
      s1 = "Text";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(ClipboardGetText(hBoard), 0, 500));
      LogOutdent(hLog);
      }
    
    if (ClipboardIsUnicodeAvailable()) {
      s1 = "Unicode";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(UnicodeToAnsi(ClipboardGetUnicode(hBoard)), 0, 500));
      LogOutdent(hLog);
      }

    LogDisplay(hLog, "Clipboard Data");

    CloseHandle(hBoard); 
    return ERROR_NONE; 
    }

We can break this down pretty easily into the sections that I laid out above, and we’ll explore that shortly. First, though, this is a script that we’re hooking into the Paste function, so any time a user pastes after the script has been run, the check_copy function will run. Because this is an example, I didn’t add a way to unhook the function, so you will have to restart GoFiler when you’re done. The first portion of the script creates the hook and prepares the log:

void setup() {

    // Run check_copy whenever the Paste menu function is invoked.
    MenuSetHook("EDIT_PASTE", GetScriptFilename(), "check_copy");
    }

void main() {

    setup();
    }

int check_copy(int f_id, string mode) {

    handle              hBoard;
    string              sClip;
    string              list[];
    int                 ix;
    int                 size;
    string              s1;
    handle              hLog;
    handle              hData;

    // Only act before the paste is carried out.
    if (mode != "preprocess") {
      return ERROR_NONE;
      }

    hBoard = ClipboardOpen();
    sClip = ClipboardGetApplication(hBoard);
    
    hLog = LogCreate("Clipboard Data");
    list = ClipboardGetFormats(hBoard);
    size = ArrayGetAxisDepth(list);
    ix = 0;

    s1 = "Clipboard Formats";
    AddMessage(hLog, s1);
    LogIndent(hLog);

After creating the hook we get the current clipboard. We don’t want to hold the handle forever, so we get the current clipboard each time that the hook is run, and we’ll close that handle before we leave the function. This allows other programs to access the clipboard. We then get the application that put the data onto the clipboard. Next we set up to report all of the data that we can about the clipboard to the user. We create a Log object, get the formats off of the clipboard, and obtain the depth of the array so we have the information needed to run our loop. Next we put a heading into the log and prepare the log for more information by indenting it.

    while (ix < size) {
      s1 = FormatString("%02d: Format Code: %-15s Format Name: %-20s Description: %-20s Size: %03d", 
                     ix, ArrayGetKeyName(list, ix), list[ix], ClipboardGetFormatDescription(hBoard, ArrayGetKeyName(list, ix)), 
                     ClipboardGetFormatSize(hBoard, ArrayGetKeyName(list, ix)));
      AddMessage(hLog, s1);
      ix++;
      }

    LogOutdent(hLog);

We start a loop to go through all of the formats on the clipboard. For each format, we gather the number, Format Code, Format Name, Format Description, and size. Then we put that information into the log. By the end of this, our log will look something like this, depending on what data is on the clipboard and where it came from:

Clipboard Formats
     00: Format Code: CF_TEXT         Format Name: ANSI Text            Description: ANSI Text            Size: 415
     01: Format Code: cf_control      Format Name: NWS Control          Description: Novaworks Application Size: 126
     02: Format Code: CF_LOCALE       Format Name: Locale (language)    Description: Locale (language)    Size: 004
     03: Format Code: CF_OEMTEXT      Format Name: OEM Text             Description: OEM Text             Size: 415
     04: Format Code: CF_UNICODETEXT  Format Name: Unicode Text         Description: Unicode Text         Size: 830

Let’s take a look at the next block of code:

    s1 = "Clipboard Data";
    AddMessage(hLog, s1);
    LogIndent(hLog);
    
    if (ClipboardIsCSVAvailable()) {
      s1 = "CSV";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(ClipboardGetCSVText(hBoard), 0, 500));
      LogOutdent(hLog);
      }
    if (ClipboardIsHTMLAvailable()) {
      s1 = "HTML";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(ClipboardGetHTML(hBoard), 0, 500));
      LogOutdent(hLog);
      }
    
    if (ClipboardIsRTFAvailable()) {
      s1 = "RTF";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(ClipboardGetRTF(hBoard), 0, 500));
      LogOutdent(hLog);
      }
    
    if (ClipboardIsTextAvailable()) {
      s1 = "Text";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(ClipboardGetText(hBoard), 0, 500));
      LogOutdent(hLog);
      }
    
    if (ClipboardIsUnicodeAvailable()) {
      s1 = "Unicode";
      AddMessage(hLog, s1);
      LogIndent(hLog);
      AddMessage(hLog, "%s", GetStringSegment(UnicodeToAnsi(ClipboardGetUnicode(hBoard)), 0, 500));
      LogOutdent(hLog);
      }

    LogDisplay(hLog, "Clipboard Data");

Here we ask the clipboard if data is available, and if it is, the function adds the data to the log along with what kind of data it is. We go through all of the textual data and not the image data because GoFiler’s logs do not support having images added to them. Most of these checks look the same, except you’ll notice there is an extra step using the UnicodeToAnsi function on our Unicode check. This is because Unicode support is currently very limited in GoFiler, and the logs do not yet support having Unicode added to them. Rather than cause errors, we do a quick conversion to show the user that there is Unicode data available on the clipboard. Our last step is to tell GoFiler to display the log in the Information View.

    CloseHandle(hBoard);
    return ERROR_NONE;
}

Finally we do some very quick cleanup work by closing the handle to the clipboard and returning without an error. Apart from the early exit when the hook is called outside of the preprocess mode, there are no other return statements in this function because we don’t care about stopping the paste function in this script; we only care about gaining information about what is on the clipboard.

Now that we have looked through all of the different options with the clipboard, I’ll walk you through another script I wrote, this time for a practical security application. Some organizations have extremely strict rules about what you can do on their network computers. Of these, some even restrict what you can do with the clipboard. So I wrote a quick script to help an organization secure the data that they put into GoFiler by restricting their paste functions to only allow pasting data coming from known programs (Microsoft Office, Notepad, Google Chrome, Mozilla Firefox, Adobe Acrobat, GoFiler, and a few others). To do this, I rely upon the information we get from the ClipboardGetApplication function. This function returns a string that includes the name of the application if the application is in the known list but otherwise returns “Unknown Source (program.exe)” if the application is not in the known list. Let’s go through the script:

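void setup() {

    // Run check_copy whenever the Paste menu function is invoked.
    MenuSetHook("EDIT_PASTE", GetScriptFilename(), "check_copy");
    }

void main() {

    setup();
    }

int check_copy(int f_id, string mode) {

    handle              hBoard;
    string              sClip;

    // Only act before the paste is carried out.
    if (mode != "preprocess") {
      return ERROR_NONE;
      }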

    hBoard = ClipboardOpen();
    sClip = ClipboardGetApplication(hBoard);
    
    if (IsInString(sClip, "Unknown Source")) {
        MessageBox("You cannot paste from an unauthorized application.\r\n%s", sClip);
        CloseHandle(hBoard);
        return ERROR_EXIT;
        }

    CloseHandle(hBoard);
    return ERROR_NONE;
}

Wow, doesn’t that look familiar? Sure enough, I used the same base as the previous script but tweaked the ending. We get the application information from the clipboard and this time, rather than just printing it out, we check to see if the clipboard data is from an unknown source. If it is, we inform the user that they cannot paste and return an ERROR_EXIT code, stopping GoFiler from finishing the paste function.

Hopefully I’ve been able to give you a new appreciation for all the work that goes on behind the scenes while you are using your computer, even when doing something as simple as copying and pasting information. We have talked about what information is stored on the clipboard and how that information is represented in the clipboard object. We have also talked about all the different functions that you can use in Legato to read and change the information on the clipboard. After getting through all of this, I have one more important piece of advice: if you are going to change the data on the clipboard, make sure that it is obvious to the user what you are doing. The last thing you want to do is leave the user feeling frustrated and confused by the data on their clipboard changing unexpectedly.

Now, my friends, go forth and clip!
 

Joshua Kwiatkowski is a developer at Novaworks, primarily working on Novaworks’ cloud-based solution, GoFiler Online. He is a graduate of the Rochester Institute of Technology with a Bachelor of Science degree in Game Design and Development. He has been with the company since 2013.