Identifying Subjective Attributes Of Entities



Identifying UGC Subjective Attributes Of Entities

This recently granted patent is about identifying subjective attributes of entities.

I haven’t seen a patent about subjective attributes of entities or responses to those entities.

A critical aspect of it is that it relies on user-generated content.

The patent tells us that user-generated content (UGC) is becoming more common on the Web because of the increasing popularity of social networks, blogs, review websites, etc.

We often see user-generated content in forms such as:

  • A comment by a first user about content shared by a second user within a social network
  • User comments in response to an article in a columnist’s blog
  • A comment on a video clip posted on a content hosting website
  • Reviews (such as of products or movies)
  • Actions (such as Like!, Dislike!, +1, sharing, bookmarking, playlisting, etc.)
  • And so forth

Under this patent, a way to identify and predict subjective attributes for entities (such as media clips, images, newspaper articles, blog entries, persons, organizations, commercial businesses, etc.) is provided.

It starts with:

  • Identifying a first set of subjective attributes for a first entity based on a reaction to the first entity (such as comments on a website, a demonstration of approval of the first entity such as “Like!”, sharing the first entity, bookmarking the first entity, or adding the first entity to a playlist)
  • Training a classifier (such as a support vector machine, AdaBoost, a neural network, or a decision tree) on a set of input-output mappings, where the set of input-output mappings comprises an input-output mapping whose input is based on a feature vector for the first entity and whose output is based on the first set of subjective attributes
  • Providing a feature vector for a second entity to the trained classifier to get a second set of subjective attributes for the second entity

A system with a memory and a processor is provided to identify and predict subjective attributes for entities.

A computer readable storage medium has instructions that cause a computer system to perform operations including:

  • Identifying a first set of subjective attributes for a first entity based on a reaction to the first entity
  • Obtaining a first feature vector for the first entity
  • Training a classifier on a set of input-output mappings, wherein the set of input-output mappings comprises an input-output mapping whose input is based on the first feature vector and whose output is based on the first set of subjective attributes
  • Obtaining a second feature vector for a second entity
  • Providing to the classifier, after the training, the second feature vector to get a second set of subjective attributes for the second entity
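As a rough sketch of the flow described above (not the patent's implementation), the steps might look like this in Python. A 1-nearest-neighbour lookup stands in for the SVM, AdaBoost, neural network, or decision tree that the patent names, and the feature vectors, scores, and function names are made-up assumptions for illustration.

```python
import math

def train(mappings):
    """'Train' by storing the input-output mappings (1-nearest-neighbour stand-in)."""
    return list(mappings)

def predict(model, feature_vector):
    """Return the attribute scores of the stored example closest to feature_vector."""
    nearest = min(model, key=lambda pair: math.dist(pair[0], feature_vector))
    return dict(nearest[1])

# Hypothetical mappings: feature vector of an entity -> relevancy scores
# derived from user reactions to that entity.
mappings = [
    ((0.9, 0.1), {"cute": 0.5, "funny": 0.0}),
    ((0.1, 0.8), {"cute": 0.0, "funny": 0.7}),
]
model = train(mappings)

# A second entity with no reactions yet gets attributes from its features alone.
second_entity_attributes = predict(model, (0.8, 0.2))
```

The point of the sketch is the shape of the data: reactions are only needed for the training entities, while the second entity is described by its feature vector alone.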

This patent on identifying subjective attributes for entities is found at:

Identifying subjective attributes by analysis of curation signals
Inventors: Hrishikesh Aradhye and Sanketh Shetty
Assignee: Google LLC
US Patent: 11,328,218
Granted: May 10, 2022
Filed: November 6, 2017

Abstract:

A system and method for identifying and predicting subjective attributes for entities (such as media clips, movies, television shows, images, newspaper articles, blog entries, persons, organizations, commercial businesses, etc.) are disclosed.

In one aspect, subjective attributes for a first media item are identified based on a reaction to the first media item, and relevancy scores for the subjective attributes with respect to the first media item are determined.

A classifier is trained using (i) a training input comprising a set of features for the first media item and (ii) a target output for the training input, the target output comprising the respective relevancy scores for the subjective attributes of the first media item.

Identifying And Predicting Subjective Attributes For Entities

The patent describes ways of identifying and predicting subjective attributes for entities (such as media clips, images, newspaper articles, blog entries, persons, organizations, commercial businesses, etc.).

Subjective attributes (such as “cute,” “funny,” “awesome,” etc.) are defined, and subjective attributes for a particular entity are identified based on user reaction to the entity, such as:

  • Comments on a website
  • Like!
  • Sharing the entity with other users
  • Bookmarking the entity
  • Adding the entity to a playlist
  • Etc

Relevancy Scores For The Subjective Attributes Are Determined For The Entity

If the subjective attribute “cute” appears in a significant proportion of comments for a video clip, then “cute” may be assigned a high relevancy score.

The entity is then associated with the identified subjective attributes and relevancy scores (such as via tags applied to the entity, via entries in a table of a relational database, etc.).

The above procedure is performed for each entity in a given set of entities (such as video clips in a video clip repository, etc.), and an inverse mapping from subjective attributes to entities in the set is generated based on the subjective attributes and relevancy scores.

The inverse mapping can then be used to identify all entities in the set that match a given subjective attribute (such as all entities that have been associated with the subjective attribute “funny”, etc.), thereby enabling:

  • Rapid retrieval of relevant entities for processing keyword searches
  • Populating playlists
  • Delivering advertisements
  • Generating training sets for the classifier
  • So forth
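The inverse mapping described above could be sketched as a dictionary from each subjective attribute to the entities tagged with it. This is only an illustration: the entity ids, the scores, and the choice to sort by relevancy score are assumptions, not details from the patent.

```python
from collections import defaultdict

def build_inverse_mapping(entity_scores):
    """Map each attribute to (entity_id, score) pairs, highest score first."""
    inverse = defaultdict(list)
    for entity_id, scores in entity_scores.items():
        for attribute, score in scores.items():
            if score > 0:  # skip attributes that were discarded (score of zero)
                inverse[attribute].append((entity_id, score))
    for matches in inverse.values():
        matches.sort(key=lambda item: item[1], reverse=True)
    return dict(inverse)

# Hypothetical per-entity relevancy scores.
entity_scores = {
    "clip_1": {"cute": 0.5, "funny": 0.2},
    "clip_2": {"funny": 0.7},
}
inverse = build_inverse_mapping(entity_scores)

# All entities associated with "funny", most relevant first.
funny_entities = [entity_id for entity_id, _ in inverse["funny"]]
```

A lookup such as `inverse["funny"]` then answers "which entities are funny?" without scanning every entity, which is what makes retrieval for searches or playlists fast.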

A classifier (such as a support vector machine [SVM], AdaBoost, a neural network, a decision tree, etc.) is trained by providing a set of training examples, where the input for a training example comprises a feature vector obtained from a particular entity (such as a feature vector for a video clip).

The feature vector may contain numerical values about:

  • Color
  • Texture
  • Intensity
  • Metadata tags associated with the video clip
  • Etc

The output of a training example comprises relevancy scores for each subjective attribute in the vocabulary for that particular entity.

The trained classifier can then predict subjective attributes for entities not in the training set (such as a newly-uploaded video clip, a news article that has not yet received any comments, etc.).

This patent can classify entities according to subjective attributes such as “funny,” “cute,” etc. based on user reaction to the entities.

This patent can improve the quality of entity descriptions, such as tags for a video clip, which in turn can improve the quality of searches and the targeting of advertisements.

A System Architecture To Identify Subjective Attributes

The system architecture includes:

  • A server machine
  • An entity store
  • Client machines connected to a network

The network may be a public network (such as the Internet), a private network (such as a local area network (LAN) or wide area network (WAN)), or a combination thereof.

The client machines may be wireless terminals (smartphones, etc.), personal computers (PC), laptops, tablet computers, or any other computing or communication devices.

The client machines may run an operating system (OS) that manages the hardware and software of the client machines.

A browser (not shown) may run on the client machines (such as on the OS of the client machines).

The browser may be a web browser that can access web pages and content served by a web server.

The client machines may also upload:

  • Web pages
  • Media clips
  • Blog entries
  • Links to articles
  • So forth

The server machine includes a web server and a subjective attribute manager. The web server and subjective attribute manager may run on different machines.

The entity store is persistent storage that is capable of storing entities such as media clips (such as video clips, audio clips, clips containing both video and audio, images, etc.) and other types of content items (such as webpages, text-based documents, restaurant reviews, movie reviews, etc.), as well as data structures to tag, organize, and index the entities.

The entity store may be hosted by storage devices, such as main memory, magnetic or optical storage-based disks, tapes or hard drives, NAS, SAN, etc.

In some implementations, the entity store may be hosted by a network-attached file server, while in other implementations it may be hosted by some other type of persistent storage, such as that of the server machine or of different machines coupled to the server machine via the network.

The entities stored in the entity store may include user-generated content that is uploaded by client machines and may include content provided by service providers such as:

  • News organizations
  • Publishers
  • Libraries
  • So on

The web server may serve web pages and content from the entity store to clients.

The subjective attribute manager:

  • Identifies subjective attributes for entities based on user reaction (such as comments, Like!, sharing, bookmarking, playlisting, etc.)
  • Determines relevancy scores for subjective attributes with respect to entities
  • Associates subjective attributes and relevancy scores with entities
  • Extracts features (such as image features like color, texture, and intensity; audio features like amplitude and spectral coefficient ratios; textual features like word frequencies, average sentence length, and formatting parameters; metadata associated with the entity; etc.) from entities to generate feature vectors
  • Trains a classifier based on the feature vectors and the subjective attributes’ relevancy scores
  • Uses the trained classifier to predict subjective attributes for new entities based on feature vectors of the new entities

A Subjective Attribute Manager

This subjective attribute manager may be the same as the subjective attribute manager of the system architecture described above and may include a:

  • Subjective attribute identifier
  • Relevancy scorer
  • Feature extractor
  • Classifier
  • Data store

The components may be combined together or separated into further components.

The data store may be the same as the entity store or a different data store (such as a temporary buffer or a permanent data store) to hold a subjective attribute vocabulary, entities that are to be processed, feature vectors associated with entities, subjective attributes and relevancy scores associated with entities, or some combination of these data.

The data store may be hosted by storage devices, such as main memory, magnetic or optical storage-based disks, tapes or hard drives, etc.

The subjective attribute manager notifies users of the types of information stored in the data store and entity store and allows users to choose not to have such information collected and shared with the subjective attribute manager.

The Subjective Attribute Identifier

The subjective attribute identifier identifies subjective attributes for entities based on user reaction to the entities.

The subjective attribute identifier may identify subjective attributes via text processing of users’ comments on an entity posted by a user on a social networking website.

The subjective attribute identifier may also identify subjective attributes for entities based on other types of user reactions to entities, such as:

  • ‘Like!’ or ‘Dislike!’
  • Sharing the en،y
  • Bookmarking the en،y
  • Adding the en،y to a playlist
  • So forth

The subjective attribute identifier may apply thresholds to determine which attributes are associated with an entity (such as requiring that a subjective attribute appear in at least N comments, etc.).

The relevancy scorer determines relevancy scores for subjective attributes with respect to entities.

For example, when the subjective attribute identifier has identified the subjective attributes “cute”, “funny”, and “awesome” based on comments on a media clip posted on a social networking website, the relevancy scorer may determine relevancy scores for each of these three subjective attributes based on:

  • The frequency with which these subjective attributes appear in comments
  • The particular users that provided the subjective attributes
  • So forth

For example, if there are 40 comments and “cute” appears in 20 comments and “awesome” appears in 8 comments, then “cute” may be assigned a relevancy score that is higher than “awesome.”

The relevancy scores may be assigned based on the proportion of comments that a subjective attribute appears in (such as a score of 0.5 for “cute” and a score of 0.2 for “awesome,” etc.).
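The proportion-based scoring in this example can be reproduced with a few lines of Python. This is a minimal sketch that assumes simple whitespace word matching; the function name and the sample comments are invented for illustration.

```python
def relevancy_scores(comments, vocabulary):
    """Score each attribute by the fraction of comments that mention it."""
    scores = {}
    for attribute in vocabulary:
        mentions = sum(1 for c in comments if attribute in c.lower().split())
        scores[attribute] = mentions / len(comments) if comments else 0.0
    return scores

# 40 hypothetical comments: 20 mention "cute", 8 mention "awesome".
comments = ["so cute"] * 20 + ["awesome clip"] * 8 + ["nice"] * 12
scores = relevancy_scores(comments, ["cute", "awesome"])
```

With these inputs, "cute" scores 20/40 and "awesome" scores 8/40, matching the 0.5 and 0.2 in the example.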

The relevancy scorer may keep only the k most relevant subjective attributes and discard the other subjective attributes.

For example, suppose the subjective attribute identifier identifies seven subjective attributes that appear in user comments at least three times. In that case, the relevancy scorer may retain only the five subjective attributes with the highest relevancy scores and discard the other two subjective attributes (such as by setting their relevancy scores to zero, etc.).
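Keeping only the k most relevant attributes might be sketched as follows. The seven attributes and their scores are hypothetical, and discarding is done by setting the score to zero, as the example above suggests.

```python
def keep_top_k(scores, k):
    """Zero out every attribute outside the k highest relevancy scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    kept = set(ranked[:k])
    return {a: (s if a in kept else 0.0) for a, s in scores.items()}

# Seven hypothetical attributes identified for one entity.
scores = {
    "cute": 0.5, "funny": 0.4, "awesome": 0.3, "sweet": 0.25,
    "weird": 0.2, "sad": 0.1, "scary": 0.08,
}
pruned = keep_top_k(scores, 5)  # "sad" and "scary" are set to 0.0
```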

A relevancy score is a real number between 0.0 and 1.0 inclusive.

The feature extractor obtains a feature vector for an entity using techniques such as:

  • Principal components analysis
  • Semidefinite embeddings
  • Isomaps
  • Partial least squares
  • So forth

In some aspects, the computations associated with extracting features of an entity are performed by the feature extractor itself.

In some other aspects, these computations are performed by another entity, such as an executable library of:

  • Image processing routines hosted by the server machine [not depicted in the Figures]
  • Audio processing routines
  • Text processing routines
  • Etc

The results are then provided to the feature extractor.

The classifier is a learning machine (such as support vector machines [SVMs], AdaBoost, neural networks, decision trees, etc.) that accepts as input a feature vector associated with an entity and outputs relevancy scores (such as a real number between 0 and 1 inclusive, etc.) for each subjective attribute of the subjective attribute vocabulary.

In some aspects, the classifier consists of a single classifier.

In other aspects, the classifier may include multiple classifiers (such as a classifier for each subjective attribute in the subjective attribute vocabulary, etc.).

A set of positive examples and a set of negative examples are assembled for each subjective attribute in the subjective attribute vocabulary.

The set of positive examples for a subjective attribute may include feature vectors for entities associated with that particular subjective attribute.

The set of negative examples for a subjective attribute may include feature vectors for entities that have not been associated with that particular subjective attribute.

When the set of positive examples and the set of negative examples are unequal in size, the larger set may be sampled to match the size of the smaller set.
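Sampling the larger set down to the size of the smaller one could look like the sketch below. Random sampling without replacement and the fixed seed are assumptions made for the illustration, and the feature vectors are made up.

```python
import random

def balance(positives, negatives, seed=0):
    """Sample both sets down to the size of the smaller one."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    size = min(len(positives), len(negatives))
    return rng.sample(positives, size), rng.sample(negatives, size)

# Hypothetical feature vectors for the attribute "cute".
positives = [[0.9, 0.1], [0.8, 0.2]]
negatives = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.2, 0.6], [0.1, 0.5]]
pos_sample, neg_sample = balance(positives, negatives)
```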

After training, the classifier may predict subjective attributes for other entities not in the training set when feature vectors for these entities are provided as input to the classifier.

A set of subjective attributes may be obtained from the classifier’s output by including all subjective attributes with non-zero relevancy scores. Alternatively, a set of subjective attributes may be obtained by applying a minimum threshold to the numerical scores (such as by considering all subjective attributes that have a score of at least, say, 0.2 as being members of the set).
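Both ways of turning scores into a set of attributes fit in one small function. This is a sketch with invented scores; the parameter name `minimum` is an assumption.

```python
def attributes_from_scores(scores, minimum=0.0):
    """Return attributes whose score is non-zero and at least `minimum`."""
    return {a for a, s in scores.items() if s > 0 and s >= minimum}

# Hypothetical classifier output for one entity.
predicted = {"cute": 0.5, "funny": 0.15, "awesome": 0.0}

non_zero = attributes_from_scores(predicted)          # every non-zero score
thresholded = attributes_from_scores(predicted, 0.2)  # scores of at least 0.2
```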

Identifying Subjective Attributes Of Entities

The method is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general-purpose computer system or a dedicated machine), or both.

The method may be performed by the server machine, while in some other implementations it may be performed by another machine.

Various components of the subjective attribute manager may run on separate machines (for example, the subjective attribute identifier and relevancy scorer may run on one machine while the feature extractor and classifier run on another machine, etc.).

For simplicity of explanation, the method is depicted and described as a series of acts.

But acts can occur in various orders and with other acts not presented and described herein.

Furthermore, not all illustrated acts may be required to implement the methods in accordance with the disclosed subject matter.

In addition, those skilled in the art will understand and appreciate that the method could be represented as a series of interrelated states via a state diagram or events.

Additionally, it should be appreciated that the methods disclosed in this specification are capable of being stored on an article of manufacture to ease transporting and transferring such methods to computing devices.

The term article of manufacture, as used herein, is intended to encompass a computer program accessible from any computer-readable device or storage media.

A vocabulary of subjective attributes is generated.

In some aspects, the subjective attribute vocabulary may be defined manually, while in some other aspects, the vocabulary may be generated in an automated fashion by collecting terms and phrases that are used in users’ reactions to entities. In yet other aspects, the vocabulary may be generated by a combination of manual and automated techniques.

For example, the vocabulary may be seeded with a small number of subjective attributes expected to apply to entities and then expanded over time as more terms or phrases that appear in user reactions are identified via automated processing of the reactions.

The subjective attribute vocabulary may be organized hierarchically, possibly based on “meta-attributes” associated with the subjective attributes (for example, the subjective attribute “funny” may have a meta-attribute “positive,” while the subjective attribute “disgusting” may have a meta-attribute “negative,” etc.).
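One simple way to picture a seeded, hierarchical vocabulary is a dictionary keyed by meta-attribute. The structure, the helper names, and the added term "boring" are assumptions for illustration only; the patent does not prescribe a data structure.

```python
# Hypothetical seed vocabulary, grouped under meta-attributes.
vocabulary = {
    "positive": {"funny", "cute", "awesome"},
    "negative": {"disgusting"},
}

def meta_attribute_of(attribute):
    """Return the meta-attribute a subjective attribute is filed under, or None."""
    for meta, attributes in vocabulary.items():
        if attribute in attributes:
            return meta
    return None

def expand(meta, attribute):
    """Add a term found in user reactions to the vocabulary under a meta-attribute."""
    vocabulary.setdefault(meta, set()).add(attribute)

# Expanding the seed vocabulary with a term seen in later reactions.
expand("negative", "boring")
```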

A set S of entities (such as all the entities in the entity store, a subset of entities in the entity store, etc.) is pre-processed.

Under one aspect, pre-processing of the entities comprises identifying user reactions to the entities and then training a classifier based on the reactions.

When An Entity Is An Actual Physical Entity

It should be noted that when an entity is an actual physical entity (such as a person, a restaurant, etc.), the pre-processing of the entity is performed via a “cyber proxy” associated with the physical entity (such as a fan page for an actor on a social networking website, a restaurant review on a website, etc.). However, the subjective attributes are considered to be associated with the entity itself (such as the actor or restaurant, not the actor’s fan page or the restaurant review).

An example of a method for performing this pre-processing is described in detail below.

An entity E that is not in set S is received (such as a newly-uploaded video clip, a news article that has not yet received any comments, an entity in the entity store that was not included in the training set, etc.).

Subjective attributes and relevancy scores for entity E are obtained.

A first example method for obtaining them is described in detail below, followed by a second example method.

The subjective attributes and relevancy scores obtained are associated with entity E (such as by applying corresponding tags to the entity, adding a record in a relational database table, etc.).

Execution then continues back to the step at which a new entity is received.

It should be noted that the classifier may be re-trained (such as after every 100 iterations of the loop, every N days, etc.) by a re-training process that may execute concurrently.

Pre-Processing A Set Of Entities

The method is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general-purpose computer system or a dedicated machine), or both.

The method may be performed by the server machine, while in some other implementations it may be performed by another machine.

The training set is initialized to the empty set. An entity E is selected and removed from the set S of entities.

Subjective attributes for entity E are identified based on user reactions to entity E (such as user comments, Like!, bookmarking, sharing, adding to a playlist, etc.).

The identification of subjective attributes includes performing processing of user comments, such as by:

  • Matching words in user comments against subjective attributes in the vocabulary
  • Combining word matching and other natural language processing techniques such as syntactic and semantic analysis
  • Etc
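The word-matching option above might be sketched as follows, with a minimum-comment threshold added. The regular-expression tokenizer, the function name, and the sample comments are assumptions; syntactic and semantic analysis is not attempted here.

```python
import re
from collections import Counter

def identify_attributes(comments, vocabulary, min_comments=1):
    """Count the comments mentioning each vocabulary term; keep those over the threshold."""
    terms = set(vocabulary)
    counts = Counter()
    for comment in comments:
        words = set(re.findall(r"[a-z']+", comment.lower()))
        counts.update(words & terms)  # each comment counts once per attribute
    return {a: n for a, n in counts.items() if n >= min_comments}

# Hypothetical comments on one entity.
comments = ["So cute!", "cute and funny", "Funny, funny clip", "meh"]
found = identify_attributes(comments, {"cute", "funny", "awesome"}, min_comments=2)
```

Counting comments rather than raw word occurrences keeps one enthusiastic commenter from inflating an attribute on their own.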

Entities That Occur In Many Locations

User reactions may be aggregated for entities that occur in many locations, such as:

  • Entities that appear in many users’ playlists
  • Entities that have been shared and appear in a plurality of users’ “newsfeeds” on a social networking website
  • Etc

The different locations may be weighted in their contribution to relevancy scores based on a variety of factors, such as:

  • The particular user associated with the location (for example, a specific user may be an authority on classical music, and thus comments about an entity in their newsfeed may be weighted more than comments in another newsfeed, etc.)
  • Non-textual user reactions (such as “Like!”, “Dislike!”, “+1”, etc.)

In addition, the number of locations where the entity appears may also be used in determining subjective attributes and relevancy scores (for example, relevancy scores for a video clip may be increased when the video clip is in hundreds of user playlists, etc.).
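The patent does not give a weighting formula, so the following is only one plausible reading: each comment carries the weight of the location it came from, and the score is the weighted share of comments that mention the attribute. The weights, comments, and function name are all hypothetical.

```python
def weighted_relevancy(comments, attribute):
    """comments: (text, weight) pairs, where weight reflects the location's authority."""
    total = sum(weight for _, weight in comments)
    if total == 0:
        return 0.0
    hits = sum(
        weight for text, weight in comments
        if attribute in text.lower().split()
    )
    return hits / total

# A comment from a (hypothetical) classical music authority's newsfeed counts double.
comments = [("moving performance", 2.0), ("boring", 1.0), ("moving", 1.0)]
score = weighted_relevancy(comments, "moving")
```

Here the weighted score is 3.0 / 4.0, whereas an unweighted proportion would be 2 of 3 comments.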

This step may be performed by the subjective attribute identifier.

Relevancy scores for the subjective attributes are determined for entity E.

A relevancy score is determined for a particular subjective attribute based on the frequency with which the subjective attribute appears in user comments, the specific users that provided the subjective attribute in their comments (for example, some users may be known from experience to be more accurate in their comments than other users), etc.

For example, if there are 40 comments and “cute” appears in 20 comments and “awesome” appears in 8 comments, then “cute” may be assigned a relevancy score that is higher than “awesome.”

The relevancy scores may be assigned based on the proportion of comments in which a subjective attribute appears (such as a score of 0.5 for “cute” and a score of 0.2 for “awesome,” etc.).

Under one aspect, the relevancy scores are normalized to fall in the interval [0, 1].

In some aspects, the subjective attributes identified may be discarded based on their relevancy scores (such as retaining the k subjective attributes with the highest relevancy scores, discarding any subjective attribute whose relevancy score is below a threshold, etc.).

It should be noted that a subjective attribute may be discarded by setting its relevancy score to zero in some aspects.

Subjective Attributes And Relevancy Scores Are Associated With The Entities

The subjective attributes and relevancy scores are associated with the entities (such as via tagging, entries in a table in a relational database, etc.).

A feature vector for entity E is obtained.

In one aspect, the feature vector for a video clip or still image may contain numerical values about color, texture, intensity, etc., the feature vector for an audio clip (or a video clip with sound) may include numerical values about amplitude, spectral coefficients, etc., and the feature vector for a text document may include:

  • Numerical values about word frequencies
  • Average sentence length
  • Formatting parameters
  • So forth

This may get performed by the feature extractor.
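For the text document case, a feature vector built from word frequencies and average sentence length could be sketched like this. The tokenizing rules and the choice of tracked words are assumptions, and formatting parameters are left out.

```python
import re

def text_feature_vector(text, tracked_words):
    """Return [frequency of each tracked word..., average sentence length in words]."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    frequencies = [
        words.count(w) / len(words) if words else 0.0 for w in tracked_words
    ]
    average_length = len(words) / len(sentences) if sentences else 0.0
    return frequencies + [average_length]

# Seven words in two sentences; "the" appears twice and "dog" once.
vector = text_feature_vector("The cat sat. The dog ran fast!", ["the", "dog"])
```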

The feature vector and the relevancy scores obtained are added to the training set as a training example.

The next block checks whether the set S of entities is empty; if S is non-empty, execution continues back to the selection of another entity, and otherwise execution proceeds to training.

The classifier is trained on all the examples of the training set, such that the feature vector of a training example is provided as input to the classifier, and the subjective attribute relevancy scores are provided as the target output.
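The pre-processing loop described in this section can be summarized in a short sketch: start with an empty training set, remove entities from S one at a time, score their attributes from comments, and pair the scores with a feature vector. The entity layout (`pixels`, `comments`), the proportion-based scoring, and the `extract_features` callback are assumptions for illustration.

```python
def build_training_set(entities, vocabulary, extract_features):
    """Turn a set S of entities into (feature_vector, relevancy_scores) examples."""
    remaining = list(entities)    # working copy of the set S
    training_set = []             # the training set starts out empty
    while remaining:              # loop until S is empty
        entity = remaining.pop()  # select and remove an entity E
        comments = entity["comments"]
        scores = {}
        for attribute in vocabulary:
            hits = sum(1 for c in comments if attribute in c.lower().split())
            scores[attribute] = hits / len(comments) if comments else 0.0
        training_set.append((extract_features(entity), scores))
    return training_set

# Two hypothetical entities with precomputed image features and user comments.
entities = [
    {"pixels": [0.9, 0.1], "comments": ["so cute", "cute", "ok", "nice"]},
    {"pixels": [0.2, 0.7], "comments": ["funny", "very funny"]},
]
training_set = build_training_set(
    entities, ["cute", "funny"], lambda entity: entity["pixels"]
)
```

The resulting list of pairs is what a classifier of any of the named types would then be trained on, with the first element of each pair as input and the second as target output.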

Obtaining Subjective Attributes And Relevancy Scores For An Entity

A feature vector for entity E is generated.

As described above, the feature vector for a video clip or still image may contain numerical values about color, texture, intensity, etc., the feature vector for an audio clip (or a video clip with sound) may include numerical values about amplitude, spectral coefficients, etc., and the feature vector for a text document may include numerical values about word frequencies, average sentence length, formatting parameters, and so forth.

The feature vector is provided to the trained classifier to obtain predicted subjective attributes and relevancy scores for entity E.

The predicted subjective attributes and relevancy scores are associated with entity E (such as via tags applied to entity E, via entries in a table of a relational database, etc.).

A Second Method For Obtaining Subjective Attributes And Relevancy Scores For An Entity

The method is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software, or a combination of both.

The method may be performed by the server machine, while in some other implementations it may be performed by another machine.

A feature vector for entity E is generated. The feature vector is provided to the trained classifier to obtain predicted subjective attributes and relevancy scores for entity E.

The predicted subjective attributes obtained are suggested to a user (such as the user who uploaded the entity). A refined set of subjective attributes is then obtained from the user (such as via a web page in which the user selects from among the suggested attributes and possibly adds new attributes, etc.).

A Default Relevancy Score For Entities

A default relevancy score is assigned to any new subjective attributes that were added by the user.

The default relevancy score may be 1.0 on a scale from 0.0 to 1.0, or the default relevancy score may be based on the particular user (such as a score of 1.0 when the user is known from past history to be very good at suggesting attributes, a score of 0.8 when the user is known to be somewhat good at suggesting attributes, etc.).

The next block branches based on whether the user removed any of the suggested subjective attributes (such as by not selecting the attribute).

If so, entity E is stored as a negative example of the removed attribute(s) for future re-training of the classifier. The refined set of subjective attributes and corresponding relevancy scores are then associated with entity E (such as via tags applied to entity E, via entries in a table of a relational database, etc.).
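The user refinement step might be sketched as below: attributes the user keeps retain their predicted scores, attributes the user adds get the default score, and attributes the user drops are recorded as negative examples. The function name, the entity id `entity_E`, and the sample attributes are assumptions.

```python
def refine_attributes(predicted, user_selection, default_score=1.0):
    """Apply a user's edits to the suggested attributes.

    Returns the refined scores and the suggested attributes the user removed.
    """
    refined = {a: predicted.get(a, default_score) for a in user_selection}
    removed = [a for a in predicted if a not in user_selection]
    return refined, removed

# Suggested attributes for entity E; the user keeps one, drops one, adds one.
predicted = {"cute": 0.6, "funny": 0.3}
refined, removed = refine_attributes(predicted, ["cute", "heartwarming"])

# Entity E becomes a negative example for each attribute the user removed.
negative_examples = {attribute: ["entity_E"] for attribute in removed}
```

A per-user default, such as 0.8 for a user with a weaker track record, would simply be passed in as `default_score`.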



Source: https://www.seobythesea.com/2022/05/identifying-subjective-attributes-of-entities/