Archives: info.maaganga.shop, February 10, 2026

Dennis Snell: HTML API: Check for unclosed attributes.

Today someone was discussing the goal of linting HTML, specifically of detecting unclosed attributes. Consider the following snippet:

<p class="important><img src="alert.png">This is important!</p>

It’s clear that a mistake led to a missing double-quote on the class attribute of the opening <p> tag. While WordPress’ HTML API doesn’t directly report this (because “unclosed attribute” isn’t particularly an HTML concept), it can be used to roughly detect it.

Here’s how to use the public functionality of the HTML API to detect unclosed attributes.

To do this, we have to define what an unclosed attribute means. For the sake of brevity we will assume that if an attribute value contains HTML-like syntax it is probably unclosed. We might be tempted to start with something like this:

<?php
foreach ( $processor->get_attribute_names_with_prefix( '' ) as $name ) {
  $value = $processor->get_attribute( $name );
  if ( ! is_string( $value ) ) {
    continue;
  }
  $checker = new WP_HTML_Tag_Processor( $value );
  if ( $checker->next_tag() ) {
    // WP_Error is not throwable, so raise a plain Exception instead.
    throw new Exception( 'Found tag syntax within attribute: is it unclosed?' );
  }
}

This approach does get pretty far, but it suffers from the fact that it’s checking decoded attribute values, meaning it will report false positives on any attribute which discusses tags, such as alt="the &lt;img&gt; tag is a void element", whose decoded value contains a literal <img>. It’s better to review the raw attribute value instead of the decoded attribute value.
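To see the false positive concretely, here is a minimal sketch that runs without WordPress. It uses PHP's built-in html_entity_decode() as a stand-in for the decoding that get_attribute() performs (an assumption for illustration; the HTML API has its own decoder), and a hypothetical looks_like_tag() helper in place of running a second Tag Processor over the value.

```php
<?php
// Raw attribute value as it would appear in the source document: properly
// closed, with the angle brackets written as character references.
$raw_value = 'the &lt;img&gt; tag is a void element';

// Stand-in for the decoded value a call like get_attribute( 'alt' ) returns.
$decoded_value = html_entity_decode( $raw_value, ENT_QUOTES | ENT_HTML5, 'UTF-8' );

// Hypothetical helper: a crude "looks like a tag" test, used here only in
// place of scanning the value with a second Tag Processor.
function looks_like_tag( string $text ): bool {
	return 1 === preg_match( '~<[a-zA-Z/!]~', $text );
}

var_dump( looks_like_tag( $decoded_value ) ); // Decoded text contains a literal "<img>".
var_dump( looks_like_tag( $raw_value ) );     // Raw text contains only "&lt;img&gt;".
```

The decoded value trips the check even though the attribute is correctly closed, while the raw value does not.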

A sneaky trick hidden in attribute removal

The Tag Processor tracks attribute offsets but doesn’t expose them, even to subclasses. The HTML API tries really hard to avoid exposing string offsets, and it does this for good reason: string offsets are easy to misuse, unclear, and finicky.

However, the Tag Processor does allow subclasses to access its lexical_updates, which is an array of string replacements to perform after semantic-level requests have been converted to text. We can analyze these updates after requesting to remove an attribute; that will return knowledge about all of the places where that attribute and any ignored duplicates appeared in the source document.

This approach also leans on the fact that static methods of subclasses have access to protected properties of the parent class.
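The visibility rule can be checked in isolation. The following is a minimal sketch with hypothetical classes (Base_Processor and Inspector are invented for illustration and are not part of WordPress): a static method on a subclass reads and resets a protected property on a plain instance of the parent class.

```php
<?php
// Hypothetical classes for illustration only; not part of WordPress.
class Base_Processor {
	protected $pending_updates = array( 'first', 'second' );
}

class Inspector extends Base_Processor {
	// A static method has no $this, yet because it runs in the scope of a
	// subclass of Base_Processor it may read and write the protected
	// property on any Base_Processor instance handed to it.
	public static function drain( Base_Processor $processor ): array {
		$updates                    = $processor->pending_updates;
		$processor->pending_updates = array();
		return $updates;
	}
}

$plain = new Base_Processor(); // Note: not an Inspector instance.
print_r( Inspector::drain( $plain ) ); // The two queued entries.
print_r( Inspector::drain( $plain ) ); // Empty: the first call reset the property.
```

Accessing $plain->pending_updates from outside either class would fail with a visibility error; the access only works from inside the class hierarchy.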

This is risky code and should be used with extreme caution, code review, and shared understanding among those who will be asked to maintain it.

<?php
class WP_Attribute_Walker extends WP_HTML_Tag_Processor {
   public static function walk( $html ) {
      $p = new WP_HTML_Tag_Processor( $html );
      while ( $p->next_tag() ) {
         $names = $p->get_attribute_names_with_prefix( '' );
         foreach ( $names as $name ) {
            $p->remove_attribute( $name );
            $updates = $p->lexical_updates;
            $p->lexical_updates = array();
            $i = 0;
            foreach ( $updates as $update ) {
               $raw_attr = substr( $html, $update->start, $update->length );
               $quote_at = strcspn( $raw_attr, '\'"' );
               $might_be_unclosed = false;
               if ( $quote_at < strlen( $raw_attr ) ) {
                  // Take the text between the opening quote and the last matching quote.
                  $raw_value = substr( $raw_attr, $quote_at + 1, strrpos( $raw_attr, $raw_attr[ $quote_at ] ) - $quote_at - 1 );
                  $checker   = new WP_HTML_Tag_Processor( $raw_value );
                  $might_be_unclosed = $checker->next_tag() || $checker->paused_at_incomplete_token();
               }
               yield $p->get_token_name() => array(
                  $name,
                  array( $update->start, $update->length ),
                  0 === $i++ ? 'non-duplicate' : 'duplicate',
                  $might_be_unclosed ? 'contains-tag-like-content' : 'does-not-contain-tag-like-content',
                  substr( $html, $update->start, $update->length ),
               );
            }
         }
      }
   }
}

This WP_Attribute_Walker::walk( $html ) method steps through each tag in the given document and returns a generator which reports each attribute on the tag, as well as some meta information about it.

$meta === array(
    'class',                       // parsed name of attribute
    array( 3, 27 ),                // (offset, length) of full attribute span in HTML
    'non-duplicate',               // whether this is the actual attribute or an ignored duplicate
    'contains-tag-like-content',   // likelihood of being unclosed
    'class="important><img src="', // full span of attribute in HTML
);
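The quote handling used to pull a raw value out of an attribute span can be exercised on its own. This is a sketch of the same idea rather than the walker's exact code: extract_raw_quoted_value() is a hypothetical helper that returns the text between the first quote character and the last occurrence of that same quote.

```php
<?php
/**
 * Hypothetical helper, for illustration only.
 *
 * Returns the text between the first quote character in the span and the
 * last occurrence of that same quote, or null when the span has no quote.
 */
function extract_raw_quoted_value( string $raw_attr ): ?string {
	// Length of the leading run that contains neither ' nor ".
	$quote_at = strcspn( $raw_attr, '\'"' );
	if ( $quote_at >= strlen( $raw_attr ) ) {
		return null; // Unquoted value or boolean attribute.
	}

	$last_quote = strrpos( $raw_attr, $raw_attr[ $quote_at ] );
	if ( $last_quote === $quote_at ) {
		return ''; // Only one quote in the span: nothing sits between quotes.
	}

	return substr( $raw_attr, $quote_at + 1, $last_quote - $quote_at - 1 );
}

var_dump( extract_raw_quoted_value( 'class="important"' ) );
var_dump( extract_raw_quoted_value( 'class="important><img src="' ) );
var_dump( extract_raw_quoted_value( 'hidden' ) );
```

For the broken span the extracted value is `important><img src=`, which is exactly the kind of tag-like content the walker flags.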

How to use this walker

<?php
$html = '<p class="important><img src="alert.png">This is important!</p>';
foreach ( WP_Attribute_Walker::walk( $html ) as $tag_name => $meta ) {
    echo "Found in <{$tag_name}> an attribute named '{$meta[0]}'\n";
    echo "  @ byte offset {$meta[1][0]} extending {$meta[1][1]} bytes\n";
    echo "  it is a {$meta[2]} attribute on the tag\n";
    echo "  its value {$meta[3]}\n";
    echo "     `{$meta[4]}`\n";
}

The output here tells us what we want to know:

Found in <P> an attribute named 'class'
  @ byte offset 3 extending 27 bytes
  it is a non-duplicate attribute on the tag
  its value contains-tag-like-content
     `class="important><img src="`
Found in <P> an attribute named 'alert.png"'
  @ byte offset 30 extending 10 bytes
  it is a non-duplicate attribute on the tag
  its value does-not-contain-tag-like-content
     `alert.png"`

For normative HTML the values are not as surprising. In this case, the missing " has been added to the class attribute.

$html = '<p class="important"><img src="alert.png">This is important!</p>';

Found in <P> an attribute named 'class'
  @ byte offset 3 extending 17 bytes
  it is a non-duplicate attribute on the tag
  its value does-not-contain-tag-like-content
     `class="important"`
Found in <IMG> an attribute named 'src'
  @ byte offset 26 extending 15 bytes
  it is a non-duplicate attribute on the tag
  its value does-not-contain-tag-like-content
     `src="alert.png"`
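For linting, usually only the suspicious attributes matter. Below is a minimal sketch of a filter over the walker's output, assuming the $meta layout described earlier. The function names are hypothetical, and the sample rows are copied by hand from the first example output rather than produced by running the walker, so the block runs without WordPress.

```php
<?php
/**
 * Hypothetical filter: yields only the attributes flagged as tag-like.
 *
 * @param iterable $walk Pairs of tag name => $meta in the walker's layout.
 */
function suspect_attributes( iterable $walk ): Generator {
	foreach ( $walk as $tag_name => $meta ) {
		if ( 'contains-tag-like-content' === $meta[3] ) {
			yield $tag_name => $meta;
		}
	}
}

// Hand-copied rows from the first example output. A generator is used
// because both rows share the key "P", which an array could not hold twice.
function sample_rows(): Generator {
	yield 'P' => array( 'class', array( 3, 27 ), 'non-duplicate', 'contains-tag-like-content', 'class="important><img src="' );
	yield 'P' => array( 'alert.png"', array( 30, 10 ), 'non-duplicate', 'does-not-contain-tag-like-content', 'alert.png"' );
}

foreach ( suspect_attributes( sample_rows() ) as $tag_name => $meta ) {
	echo "<{$tag_name}> {$meta[0]} at byte {$meta[1][0]}: {$meta[4]}\n";
}
```

With WordPress loaded, the same filter could be given WP_Attribute_Walker::walk( $html ) in place of sample_rows().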

Summary

This code is not meant to be normative; it’s probably missing important details. It’s here to demonstrate one way we can take advantage of the already-available aspects of the HTML API to perform more interesting work.

In this case, we can tug at some of its internals to build linting and reporting tools which investigate aspects not exposed in the public interface: duplicate attributes and raw attribute values.

Checking whether an attribute is closed is a tricky problem. We can only approach it with a set of heuristics that estimate the likelihood that an attribute isn’t closed, because HTML parsers will universally interpret any given string in a specific way and, regardless of errors, will produce tags and attributes from it.

Before we reach for custom regular expressions (PCRE), we can look into the HTML API and consider the sliding scale of safety it presents to us; we can take advantage of the parsing it’s already performing to remove the need to replicate all of HTML’s complicated parsing rules in our custom code.

By rmshekhar@gmail.com on February 10, 2026 | Uncategorized | A comment?

Open Channels FM: Hey, What Do You Think About the Internet (and What We Might Have Lost)?

Nathan and Bob reminisce about the “good ol’ days” pre-internet, lamenting over lost patience and tactile experiences while praising modern conveniences. It’s a nostalgic roast of technology’s double-edged sword.

By rmshekhar@gmail.com on | Uncategorized | A comment?

Matt: Leadership at the Peak

I want to start by thanking the Automattic board, and in particular General (Ret.) Ann Dunwoody, for encouraging me to step away from the endless work of being CEO of Automattic to focus on training and development. Ann, as one of this generation’s great leaders, did it herself before recommending it. She took the course shortly after becoming a four-star.

The course was Leadership at the Peak from the Center for Creative Leadership, a nonprofit founded in 1970 by the family that invented Vicks VapoRub.

As I reflect on all the corporate training I’ve had, from the first class they made me take at CNET 22 years ago because my title had manager in it, to the workshops or intensive CEO things I’ve been lucky enough to be exposed to later, there’s one thing that really stands clear: You get out of any program what you put into it.

If you come in skeptical, distracted, or resentful, even if golden information is being dropped, it will bounce off you like water on a duck. You have to put yourself in a state of mind of extreme openness and enthusiasm, and take an earnest try at what the facilitators have designed and planned, no matter how cheesy, corny, obvious, or silly it might seem. Remember, their intention is for you to get something out of this, and they’ve done it before.

Holding that state of openness is also a catalyst for the teacher; they light up when students are willing to trust the process, and they’ll give you their very best. I originally titled this post “Complete Surrender” because that extreme statement helps me step out of the part of my mind that is always trying to challenge authority, remix conventions, or think I’m cleverer than others.

These programs are usually expensive, not just in dollars but in time you have to clear from other commitments, so don’t squander it by staying in your default modes of checking work, news, etc. Create a space for yourself to reflect, learn, and grow. It’s rare and precious.

The caveat, of course, is to choose your teachers well. CCL has been doing this since the 70s; they’ve figured a few things out. They’re Lindy. All of these programs change and evolve over time; they’re not carved in stone, but it’s particularly interesting to see what survives when something has been going on for a long time.

I’m also not religious about these things. I think of them as mental models that are new arrows in your quiver. You can use them as is, or, even better, mix them with something else you’ve learned to create something more useful and personalized to your context. The more you have, the more sturdy your latticework of understanding is, and the more robust your information framework will be when you encounter something novel.

There’s also some luck in the group; a bad apple can throw off the week for everyone. My cohort had people from a variety of industries like healthcare, paper products, car rentals, and business process outsourcing from all around the world, including Egypt, Brazil, Saudi Arabia, and all across the US. It would have been easy for people to be guarded, but everyone really leaned in. I think we had so little network overlap that people felt more comfortable opening up. And, of course, it was endlessly fascinating to learn about the challenges across vastly different industries, as well as the universal commonalities that arise whenever you try to vector a group of humans towards a common goal.

One of the inspirations I drew from Ann’s book, A Higher Standard, was the extent to which the Army invests in training and development, sometimes sending people to programs for years before they move into a new role. They’re always thinking about the next generation.

A big theme for me in 2026 is learning: Last month at Automattic, we did our first two-week in-person AI Enablement intensive at our Noho Space, and the feedback was incredible. On the WordPress side, this year we’ll have thousands of college students enroll in our new WordPress Credits program to earn credits toward their degrees. The number of cities where WP meetups are held is on track to double; it’s clear people are hungry for opportunities to learn and grow.

People have been asking my takeaways from the course, and it’s been hard to summarize, but I came away with big lessons on how my comfortable and improvisational presentation style can come off as not having a solid plan or being prepared, the importance of exercise and nutrition to have the energy you need as a leader, and the importance of being on time and what that signals to others. Great feedback is a gift and a mirror, allowing you to see things you might miss about how you show up to others. In the course, we made plans, and since then, I’ve been experimenting with integrating these learnings and others into my day-to-day. I feel like it’s really had an impact.

So in closing, when you’re a busy executive, there’s never a good time to step away for a week, but I highly encourage every leader to at least once a year invest in themselves and let your colleagues and loved ones know that for a few days you’ll be really focused on a departure from your quotidian day-to-day and work on growth. It’s hard but worth it.

By rmshekhar@gmail.com on | Uncategorized | A comment?