Research on morphology optimization of heavy-duty industrial robots based on Kansei engineering and artificial intelligence generated content technology

Related theories and methods

Shape grammar and CBA

Shape grammar (SG) is a design method based on shape change proposed by Stiny and Gips in 197255. In essence, SG redesigns or modifies an initial shape by applying rules56. Mature products, such as cars, mobile phones, and cameras, use SG to inherit family design genes57. For example, Michael et al. took Harley-Davidson motorcycles as a case to establish the connection between brand and shape design by encoding shape rules58. McCormack et al.59 also demonstrated the advantages of SG in maintaining product recognition and appearance innovation through an analysis of Buick front-end design. They realized the innovative design of the inner cover of vehicles through interactive rule application and used SG to generate structures that are both functional and aesthetically pleasing60. Zhu et al.61 used SG to conduct in-depth research on the design of BYD automobile derivatives. However, SG requires substantial professional knowledge when dealing with complex parameter encoding. Although some scholars use Rhino software to handle this encoding, computer and mathematical skills are still required.

Generally, in product design, morphological element curves are the key to achieving morphological blending and morphological synthesis62. Blending is the process of generating an intermediate contour sequence that combines the geometric features of the source shapes. This process is achieved by establishing the correspondence of feature points and interpolating between the contour curves. Critically, the resulting sequence must be manufacturable and analyzable63. Unlike image-space deformation, which focuses on visual effects, this geometric-level two-dimensional contour curve blending uses the morphological element curve as the basic unit and directly serves morphological synthesis and shape grammar construction in industrial product design64. To achieve curve blending, Chen and Parent65 introduced three methods for establishing the corresponding feature points of two curves: middle area, ray shooting, and minimum distance. Subsequently, Sederberg and Greenwood66 proposed a simple and effective curve blending method based on the middle area method. Hsiao and Chuang proposed a curve blending method based on the ray shooting method, which can obtain morphological curves using different blending algorithms. In a recent study, Hsiao et al.67 proposed a hybrid method based on ray casting that deconstructs and reconstructs morphological curves while maintaining morphological features. Based on the above research, once the corresponding feature points of the two morphological element curves are established, four types of CBA are available for selection50. The method places the two curves in a two-dimensional coordinate system and uses the coordinate values of the corresponding feature points of the two curves as calculation parameters. The optional formulas are shown in equations (1) to (4).

  1. Weighted arithmetic mean method:

    $$\text{C}={\text{w}}_{1}\times {\text{g}}_{1}+{\text{w}}_{2}\times {\text{g}}_{2}$$

    (1)

  2. Weighted geometric mean method:

    $$\text{C}=\sqrt{{{\text{g}}_{1}}^{{\text{w}}_{1}}+{{\text{g}}_{2}}^{{\text{w}}_{2}}}$$

    (2)

  3. Weighted harmonic mean method:

    $$\text{C}=\frac{1}{\frac{{\text{w}}_{1}}{{\text{g}}_{1}}+\frac{{\text{w}}_{2}}{{\text{g}}_{2}}}$$

    (3)

  4. Generalized weighted mean method:

    $$\text{C}={\left[{\left({\text{w}}_{1}\times {\text{g}}_{1}\right)}^{\alpha }+{\left({\text{w}}_{2}\times {\text{g}}_{2}\right)}^{\alpha }\right]}^{1/\alpha }$$

    (4)

In the formula, g1 and g2 are the feature point sets of the two curves, w1 and w2 are the weight values of the feature point sets, α is the attitude parameter value, and C is the new curve generated by the mixture.
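As a rough illustration of how such blending rules act on paired feature points, the following Python sketch applies four weighted means coordinate by coordinate. It is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the weights are normalized so that w1 + w2 = 1, and the geometric and generalized means use their textbook definitions, which may differ in detail from equations (2) and (4) as printed. The geometric and harmonic variants also assume strictly positive coordinates.

```python
def _normalized(w1, w2):
    """Scale two weights so they sum to 1 (e.g. the ratio 3:1 becomes 0.75, 0.25)."""
    total = w1 + w2
    return w1 / total, w2 / total


def arithmetic_mean(g1, g2, w1, w2):
    a, b = _normalized(w1, w2)
    return a * g1 + b * g2


def geometric_mean(g1, g2, w1, w2):
    # Textbook weighted geometric mean (assumption); needs g1, g2 > 0.
    a, b = _normalized(w1, w2)
    return (g1 ** a) * (g2 ** b)


def harmonic_mean(g1, g2, w1, w2):
    # Needs nonzero coordinates.
    a, b = _normalized(w1, w2)
    return 1.0 / (a / g1 + b / g2)


def power_mean(g1, g2, w1, w2, alpha):
    # Textbook generalized mean (assumption): alpha = 1 reduces to the
    # arithmetic mean and alpha = -1 to the harmonic mean.
    a, b = _normalized(w1, w2)
    return (a * g1 ** alpha + b * g2 ** alpha) ** (1.0 / alpha)


def blend_curves(curve1, curve2, w1, w2, mean=arithmetic_mean):
    """Blend two equally long lists of (x, y) feature points pairwise."""
    if len(curve1) != len(curve2):
        raise ValueError("curves need the same number of feature points")
    return [(mean(x1, x2, w1, w2), mean(y1, y2, w1, w2))
            for (x1, y1), (x2, y2) in zip(curve1, curve2)]
```

For instance, `blend_curves(curve_a, curve_b, 3, 1)` would weight the first curve three times as heavily as the second.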

Ray-firing method

The Ray-firing method was initially limited to morphological transitions between two unexpanded curve segments. This study extends the method to arbitrary closed curves and constructs an innovative framework for the morphological optimization of industrial robots. It demonstrates that the Ray-firing method is a powerful tool for innovative design, capable of retaining a brand’s original visual characteristics to ensure brand consistency and preserve the value of brand assets in new industrial robot designs50,68.

Step 1: Use continuous cubic Bezier curves to draw two morphological curves (such as curve a and curve b), and obtain the feature point sets of the two curves, namely {a1, a2, a3,…, a10} and {b1, b2, b3,…, b12}. As shown in Figure 1(a), curve a has 10 feature points and curve b has 12 feature points.

Fig. 1 Fusion process using the Ray-firing method: (a) two curves (curve a and curve b) and their sets of feature points; (b) schematic diagram of the Ray-firing method; (c) five alternative plans generated from curves a and b.

Step 2: Before blending, the two curves must have the same number of feature points. First, substitute the coordinate values of the feature points into formula (5) to calculate the centroids of curve a and curve b, and overlap the two curves at their centroids, as shown in Figure 1(b). Second, take curve b as the reference curve of curve a, and cast rays from the common centroid through each of the 12 feature points of curve b. Finally, take the 12 intersection points of these rays with curve a (the red points on curve a) as the new feature points of curve a, namely {a1’, a2’, a3’,…, a12’}.

$$\overline{{x}_{c}}=\frac{{\sum }_{i=1}^{n}{m}_{i}{x}_{i}}{{\sum }_{i=1}^{n}{m}_{i}},\quad \overline{{y}_{c}}=\frac{{\sum }_{i=1}^{n}{m}_{i}{y}_{i}}{{\sum }_{i=1}^{n}{m}_{i}},\quad \text{where } {m}_{i}=1$$

(5)

where \(\overline{{x }_{c}}\) and \(\overline{{y }_{c}}\) are the centroid coordinates, n is the number of feature points, xi and yi are the feature point coordinates.

Step 3: Once the base curve and the reference curve have the same number of feature points, a new curve can be constructed using any of equations (1) to (4). For example, in the xy coordinate system, the feature point coordinates of the base curve, the new feature point coordinates of the reference curve, and five weight ratios (3:1, 2:1, 1:1, 1:2, and 1:3, corresponding to plans 1 to 5) are substituted into formula (1) to construct five curves, as shown in Figure 1(c).
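Steps 1 to 3 can be sketched in Python as follows. This is a simplified illustration rather than the study's implementation: each curve is approximated by the closed polyline through its feature points instead of cubic Bezier segments, rays are cast through the feature points of curve b so that curve a receives as many new points as curve b has, and the nearest hit along each ray is kept. All function names are hypothetical.

```python
def centroid(points):
    """Equation (5) with every m_i = 1: the mean of the feature points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def recenter(points):
    """Translate points so that their centroid sits at the origin."""
    cx, cy = centroid(points)
    return [(x - cx, y - cy) for x, y in points]


def ray_hit(direction, polyline, eps=1e-9):
    """Nearest intersection of a ray from the origin with a closed polyline.

    Returns None if the ray misses (for example, a zero-length direction).
    """
    dx, dy = direction
    best_t, best_point = None, None
    for i in range(len(polyline)):
        x1, y1 = polyline[i]
        x2, y2 = polyline[(i + 1) % len(polyline)]
        ex, ey = x2 - x1, y2 - y1
        denom = dx * ey - dy * ex
        if abs(denom) < eps:
            continue  # ray is parallel to this segment
        t = (x1 * ey - y1 * ex) / denom  # position along the ray
        s = (x1 * dy - y1 * dx) / denom  # position along the segment
        if t > eps and -eps <= s <= 1 + eps and (best_t is None or t < best_t):
            best_t, best_point = t, (t * dx, t * dy)
    return best_point


def resample_by_rays(curve_a, curve_b):
    """Give curve a one new feature point per feature point of curve b."""
    a0, b0 = recenter(curve_a), recenter(curve_b)
    new_a = [ray_hit(p, a0) for p in b0]
    return new_a, b0


def blend(curve1, curve2, w1, w2):
    """Weighted arithmetic mean of paired points, as in equation (1)."""
    total = w1 + w2
    a, b = w1 / total, w2 / total
    return [(a * x1 + b * x2, a * y1 + b * y2)
            for (x1, y1), (x2, y2) in zip(curve1, curve2)]


# Five plans from the weight ratios named in Step 3 (toy coordinates).
curve_a = [(9.0, 4.0), (11.0, 4.0), (11.0, 6.0), (9.0, 6.0)]
curve_b = [(2.0, 0.0), (0.0, 2.0), (-2.0, 0.0), (0.0, -2.0), (2.0, 1.0), (-2.0, -1.0)]
new_a, b_centered = resample_by_rays(curve_a, curve_b)
plans = [blend(new_a, b_centered, w1, w2)
         for w1, w2 in [(3, 1), (2, 1), (1, 1), (1, 2), (1, 3)]]
```

Keeping the nearest hit is only one possible choice; for strongly concave curves a ray may cross the outline several times, and a different selection rule may be preferable.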

DeepSeek

DeepSeek is a high-performance large language model built with an advanced distributed training architecture69. In KE, DeepSeek can automatically generate emotional imagery adjectives from users’ written prompts, providing scalable corpora for product emotion computing and improving the consistency of intelligent assistants in dialogue and task processing. With its strong large-language-model capabilities, DeepSeek has recently become a rapidly developing application70. Its progress is driven by continuous optimization, adaptation to various application scenarios, and frequent updates, such as the release of DeepSeek-v3 in December 2024.

As with text-to-image generation tools, prompt engineering is crucial to DeepSeek. On the Internet, many researchers and enthusiasts actively practice prompt engineering71, discussing and sharing effective techniques for using DeepSeek accurately and exploring its functional limits. The key elements of a DeepSeek prompt are context, instructions, output indicators, and input data. For example, “I am conceiving an industrial robot of a Chinese enterprise brand (context). Please give the design concept (instructions), within 200 adjectives (output indicators), covering functions, appearance, and materials (input data).” There are also practical tips, such as “Please analyze as an expert (role)”, “Please imitate the style of (person or brand)”, and “Organize the following information in the specified format”.
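The four prompt elements can be assembled programmatically. The Python sketch below only builds the prompt text; it does not call any DeepSeek API, and the function name and parameter names are hypothetical choices made for illustration.

```python
def build_prompt(context, instruction, output_indicator, input_data, role=None):
    """Assemble a prompt from context, instructions, output indicators, input data.

    An optional role tip ("Please analyze as ...") is placed in front.
    """
    parts = []
    if role:
        parts.append(f"Please analyze as {role}.")
    parts.append(context.strip())
    parts.append(
        f"{instruction.strip()}, {output_indicator.strip()}, {input_data.strip()}."
    )
    return " ".join(parts)


prompt = build_prompt(
    context="I am conceiving an industrial robot of a Chinese enterprise brand.",
    instruction="Please give the design concept",
    output_indicator="within 200 adjectives",
    input_data="covering functions, appearance, and materials",
)
```

Keeping the elements as separate arguments makes it easy to vary one of them, for example the output indicator, while holding the others fixed across queries.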

In the field of natural language processing, DeepSeek-v3 and GPT-4.1 are two highly representative advanced language models72. GPT originated in the United States, and DeepSeek originated in China. Table 1 compares the two along several key dimensions and lists their similarities and differences. Both are powerful language models, each with its own advantages73. In the field of industrial design, DeepSeek has great potential and is suitable for brainstorming, conceptual design, and competitive product analysis. DeepSeek can also generate the prompts required by image-generation artificial intelligence74, which makes prompt engineering an essential skill for designers. Moreover, thanks to its language processing capabilities, DeepSeek excels at understanding descriptive vocabulary at various levels of abstraction, and as a model that originated in China it aligns with the needs of Chinese companies. These properties connect it closely to KE theory and to image generation tools such as Midjourney and Stable Diffusion.

Table 1 Comparison between DeepSeek-v3 and GPT-4.1.

KE

KE is a systematic way of turning consumers’ feelings and emotional needs into concrete product features and functions75. Within this framework, phone interviews with business users play six key roles. First, they act like an emotional radar, quickly and cheaply cutting through company layers to capture what decision-makers really feel about product performance and brand image. Second, they serve as a semantic anchor, translating vague feelings into clear, countable words through semi-structured calls, giving the first raw data for later semantic-differential matrices. Third, they work as a dimension probe: by transcribing, coding, and counting words, we uncover hidden emotional angles and enlarge the space for principal-component analysis76. Fourth, they are a weight checker: short follow-up calls test how strongly each keyword resonates, correcting any mismatch between what people say and what the numbers say. Fifth, they act as an iteration trigger, feeding fresh market trends, competitor notes, and usage scenes into a live update loop that keeps generative models retraining. Sixth, they build a trust bridge, using frequent, low-friction contact to keep key users engaged so later prototypes and data flow run smoothly77. Phone interviews are thus both the doorway to measuring emotions in KE and the live tuning knob that runs through every stage of user-centred innovation.

The standard five-step workflow is as follows. Step 1: Zero in on product pain points and core decision-makers, design a semi-structured, scenario-based guide, and run small-sample, in-depth interviews to catch spoken feelings in real time. Step 2: Transcribe the calls, build a multidimensional Kansei rating matrix using the semantic-differential method, code the raw words, count frequencies, and check reliability. Step 3: Use principal-component analysis to pull out the hidden Kansei dimensions, pick statistically strong and highly distinctive keywords across three levels of emotional needs, and link them to quantifiable design variables such as colour, material, and form78. Step 4: Break down key elements with morphological grammar, compute how much each variable shapes the intended image, and let large models quickly output prototypes that match the target feelings79. Step 5: Validate the prototypes’ emotional impact with follow-up calls, and fold user feedback and market data into a dynamic update loop for continuous model improvement. In this way, phone interviews transform abstract feelings into actionable design parameters, establishing a method that is both theoretically sound and practically workable for user-centered industrial innovation80. At the same time, the resulting labels and weight maps supply high-quality, scalable prompts for AIGC-driven design.
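The word-counting part of Step 2 can be illustrated with a small Python sketch. The transcripts, the candidate vocabulary, and the frequency threshold below are invented for illustration only, and the function names are hypothetical; the reliability check and principal-component analysis of the later steps are not covered.

```python
import re
from collections import Counter


def keyword_frequencies(transcripts, vocabulary):
    """Count how often each candidate Kansei word occurs in the transcripts."""
    vocab = {word.lower() for word in vocabulary}
    counts = Counter()
    for text in transcripts:
        for token in re.findall(r"[a-z\-]+", text.lower()):
            if token in vocab:
                counts[token] += 1
    return counts


def frequent_keywords(counts, min_count):
    """Keep words mentioned at least min_count times, most frequent first."""
    return [word for word, count in counts.most_common() if count >= min_count]


# Invented example data (not from the study).
transcripts = [
    "The arm looks sturdy and modern.",
    "Sturdy, but not very modern; rather heavy.",
    "It feels sturdy.",
]
vocabulary = ["sturdy", "modern", "heavy", "elegant"]
counts = keyword_frequencies(transcripts, vocabulary)
shortlist = frequent_keywords(counts, min_count=2)
```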

FAHP

The analytic hierarchy process (AHP), proposed by Saaty in the 1970s, is a method for solving decision problems that involve multiple criteria81. Combining the fuzzy comprehensive evaluation method with AHP, scholars later proposed the fuzzy analytic hierarchy process (FAHP)82. This method effectively reduces the errors caused by the traditional AHP’s susceptibility to individual extreme values and respondent subjectivity83. FAHP has since been applied to decision problems in product design, and its operation steps are roughly the same as those of AHP84. The implementation steps are as follows85.

Step 1: Decompose the decision problem into the target layer, the criterion layer, and the indicator layer. Obtain user needs through questionnaire surveys and on-site interviews, filter and summarize them, and then build the hierarchical model.

Step 2: Construct a fuzzy judgment matrix. Compare the indicators at the same level pairwise to determine the relative importance. The fuzzy judgment reference is shown in Table 2. Experts score the indicators according to the table and construct the fuzzy judgment matrix Q:

$$Q={\left({r}_{ij}\right)}_{n\times n}=\left[\begin{array}{cccc}{r}_{11}& {r}_{12}& \cdots & {r}_{1n}\\ {r}_{21}& {r}_{22}& \cdots & {r}_{2n}\\ \vdots & \vdots & \ddots & \vdots \\ {r}_{n1}& {r}_{n2}& \cdots & {r}_{nn}\end{array}\right]$$

(6)

In the formula, rij represents the comparison between factor ri and factor rj, i, j = 1, 2, …, n.

Step 3: Calculate the indicator weights, for example by the summation method, which is simple and highly stable. The formula is as follows:

$${w}_{i}=\frac{{\sum }_{j=1}^{n}{r}_{ij}+\frac{n}{2}-1}{n(n-1)}$$

(7)

In the formula, wi is the weight of indicator ri, and n is the order of the matrix.
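A minimal Python sketch of the summation method is given below, reading formula (7) as a row sum over j for each indicator i. It assumes a fuzzy complementary judgment matrix, that is, r_ii = 0.5 and r_ij + r_ji = 1, which is what makes the resulting weights sum to 1; the source does not state this property explicitly, and the function name and example matrix are hypothetical.

```python
def fahp_weights(matrix, tol=1e-9):
    """Indicator weights from a fuzzy complementary judgment matrix.

    Assumes r_ii = 0.5 and r_ij + r_ji = 1 (an assumption of this sketch).
    """
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            if abs(matrix[i][j] + matrix[j][i] - 1.0) > tol:
                raise ValueError("matrix is not fuzzy complementary")
    # w_i = (row sum of row i + n/2 - 1) / (n * (n - 1))
    return [(sum(row) + n / 2 - 1) / (n * (n - 1)) for row in matrix]


# Invented 3 x 3 example: indicator 1 is judged more important than 2 and 3.
Q = [
    [0.5, 0.7, 0.8],
    [0.3, 0.5, 0.6],
    [0.2, 0.4, 0.5],
]
weights = fahp_weights(Q)
```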

Step 4: Calculate the consistency ratio (CR). If CR ≤ 0.1, the judgment matrix is considered to have good consistency. The consistency index (CI) is calculated first:

$$CI=\frac{{\lambda }_{max}-n}{n-1}$$

(8)

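where λmax is the largest eigenvalue of the judgment matrix and n is its order. The CR is then calculated as follows:

$$CR=\frac{CI}{RI}$$

(9)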

The random index (RI) values corresponding to matrices of different orders can be obtained by looking up Table 3.

Table 3 RI values for matrices of orders 1 to 9.

Step 5: Weight ranking of demand indicators. Through comprehensive weight calculation, determine the weight ranking of each indicator relative to the target layer, and perform a consistency check again.
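The consistency check of Step 4 can be sketched in Python as follows. The sketch uses the standard AHP relation CR = CI/RI and estimates λmax by power iteration. It assumes a positive reciprocal (Saaty-style) judgment matrix, which is an assumption of this example rather than something stated in the text, and it takes RI as an argument so that the value can be read from Table 3. The function names are hypothetical, and 0.58 is used only as a commonly cited RI value for third-order matrices.

```python
def largest_eigenvalue(matrix, iterations=200):
    """Estimate the dominant eigenvalue of a positive matrix by power iteration."""
    n = len(matrix)
    v = [1.0 / n] * n  # start vector, normalized to sum to 1
    lam = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)  # equals the eigenvalue once v has converged (sum(v) == 1)
        v = [x / lam for x in w]
    return lam


def consistency_ratio(matrix, ri):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1), as in equation (8)."""
    n = len(matrix)
    ci = (largest_eigenvalue(matrix) - n) / (n - 1)
    return ci / ri


# Invented examples: a perfectly consistent and a clearly inconsistent matrix.
consistent = [[1, 2, 4], [0.5, 1, 2], [0.25, 0.5, 1]]
inconsistent = [[1, 2, 0.25], [0.5, 1, 2], [4, 0.5, 1]]
cr_good = consistency_ratio(consistent, ri=0.58)
cr_bad = consistency_ratio(inconsistent, ri=0.58)
```

In this example `cr_good` is essentially zero, so that matrix passes the CR ≤ 0.1 check, while `cr_bad` exceeds the threshold and the corresponding judgments would need to be revised.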

Implementation process of the design method based on KE and AIGC technology

This study aimed to develop a morphological optimization design method for HDIRs based on KE, CBA, and AIGC technology and to verify its effectiveness. The method optimizes the investigation, AIGC generation, curve blending, and visual output stages of the HDIR design process. The complete implementation steps of the proposed design method are as follows.

Design and investigation stage

Step 1: Collect users’ adjective descriptions of the product through telephone interviews with enterprise users to form a Kansei vocabulary set. Screen and analyze the vocabulary to eliminate uncommon and difficult adjectives, and optimize its structure by merging, splitting, and reorganizing so that it is concise and efficient. This improves the accuracy and practicality of the vocabulary database and provides reliable language support for Kansei cognition research. In this study, the emotional imagery adjectives of the HDIRs’ form were first collected through enterprise telephone interviews, following the traditional KE method, and the users’ Kansei needs for the HDIRs’ form were identified. DeepSeek was then used to generate further emotional imagery adjectives of the HDIRs’ form. Specifically, DeepSeek-v3 was adopted to generate a set of emotional imagery adjectives suitable for describing the form of the target product. To obtain objective emotional imagery adjectives when querying DeepSeek, practical tips (from the DeepSeek official website) can be used. Tip 1: Assign an industrial role to DeepSeek. Tip 2: Provide background information for the question. Tip 3: Ask questions precisely. Tip 4: Advance the question systematically. Tip 5: Guide DeepSeek to a reasonable and logical response. Tip 6: Provide feedback to DeepSeek. This step generates many emotional imagery adjectives that describe the target product.

Step 2: Select typical adjectives from the emotional imagery adjectives generated in Step 1. Because AI text generation is uncertain, designers and consumers are invited to take part in the selection, which enhances the objectivity and rigor of the screening. The top-ranked adjectives are then taken as the typical emotional imagery adjectives. The other sub-phrases generated usually cover the theme (target product), background settings, rendering effects, image quality, and various control parameters.

AIGC design stage

Step 3: Enter the emotional imagery adjectives and prompt sub-phrases in Midjourney. To ensure that the generated form conforms to the typical emotional imagery adjectives, the other sub-phrases must not carry any image tendency of their own. For generating the product database, the “MJ version” option should be selected among the Midjourney versions. Subsequently, multiple reference prompts containing the core elements of the product are constructed again through DeepSeek and combined with the general product descriptions obtained by uploading an image of the product’s basic form to Midjourney’s image-to-text “/describe” function. Through systematic parameter optimization experiments, the optimal prompt structure is finally determined. In Midjourney’s “/imagine” command, enter the typical emotional imagery adjectives followed by the other necessary sub-phrases in sequence to generate a set of target product forms corresponding to the typical emotional imagery adjectives.

Step 4: Use the images generated by Midjourney to establish a reference database for the target product. The reference database consists of first-generation and second-generation images. Specifically, the first use of the “/imagine” command produces four images, which form the first-generation database of the target product. Based on each first-generation result, running the “/imagine” command again produces four additional images that differ slightly from one another; these form the second-generation database. Additionally, the image tendency of the second-generation forms can be adjusted via the “Remix” mode and the “--chaos” parameter.
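The prompt ordering described in Step 3, adjectives first and the remaining sub-phrases afterwards, can be sketched as a small Python helper that only builds the command text. The function name and the example phrases are hypothetical; the helper does not talk to Midjourney, and the 0 to 100 range check reflects the range commonly documented for the chaos parameter.

```python
def imagine_prompt(adjectives, sub_phrases, chaos=None):
    """Build the text of an "/imagine" command: adjectives, then sub-phrases."""
    text = ", ".join(list(adjectives) + list(sub_phrases))
    command = f"/imagine prompt: {text}"
    if chaos is not None:
        if not 0 <= chaos <= 100:
            raise ValueError("chaos is expected in the range 0-100")
        command += f" --chaos {chaos}"
    return command


# Invented example phrases (not the prompts used in the study).
command = imagine_prompt(
    adjectives=["sturdy", "modern"],
    sub_phrases=["heavy-duty industrial robot", "white background"],
    chaos=20,
)
```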

Curve blending design stage

Step 5: Update the basic image. The target image is established based on the typical emotional imagery adjectives of the target product, and serves as the reference standard for updating the basic image. Subsequently, based on the set target image, select the corresponding image from the database as the reference image. Finally, the element curves of the basic image and the reference image are drawn using cubic Bezier curves, and the two form curves are normalized according to the overall width or height of the shape.
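The normalization at the end of Step 5 can be illustrated with the following Python sketch, which rescales a curve so that its overall width or height equals 1. It works on the feature points only, so the bounding box is that of the points rather than of the exact Bezier curve; this simplification and the function name are assumptions made for illustration.

```python
def normalize_curve(points, by="width"):
    """Shift a curve to the origin and scale it by its overall width or height."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    size = (max(xs) - min_x) if by == "width" else (max(ys) - min_y)
    if size == 0:
        raise ValueError("curve has zero extent along the chosen dimension")
    # The same factor is used for x and y so the shape is not distorted.
    return [((x - min_x) / size, (y - min_y) / size) for x, y in points]


outline = [(2.0, 1.0), (6.0, 1.0), (6.0, 3.0), (2.0, 3.0)]
by_width = normalize_curve(outline, by="width")
by_height = normalize_curve(outline, by="height")
```

Normalizing the basic and reference curves by the same dimension keeps their proportions comparable before their feature points are paired.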

Step 6: Invite experts to use morphological analysis to deconstruct the product’s appearance, allowing several design features to be decomposed. The basic image and the reference image are decomposed into a combination of various element curves. The computer-aided design tool Rhino is used to display the number of feature points and coordinate values of each element curve. Subsequently, the coordinate values of the feature points are substituted into formula (5) to calculate the centroid of each form element curve.

Step 7: Generate Blended Curves. First, overlay the paired morphological element curves from the basic and reference images, aligning them by their centroids. Then, use the CBA and Ray-firing method to blend the curves. By adjusting the feature point weights (w1, w2) as defined in formulas (1) through (4), multiple new morphological element curves can be generated from each initial pair.

Step 8: Use equation (5) to calculate the centroid coordinates of the newly generated curves. Subsequently, using the centroid of the reference form as the reference, the newly generated element curves are combined to form multiple new complete forms. During the combination process, each form should be composed of element curves created with the same mixing ratio. Finally, the resulting complete morphological curves serve as new plans for the target product.
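The rule that each complete form combines only element curves blended with the same ratio can be expressed as a short loop. The Python sketch below uses the weighted arithmetic mean of equation (1) and assumes that each pair of element curves already has matching feature points; the function names and toy coordinates are hypothetical.

```python
def blend(curve1, curve2, w1, w2):
    """Weighted arithmetic mean of paired feature points, as in equation (1)."""
    total = w1 + w2
    a, b = w1 / total, w2 / total
    return [(a * x1 + b * x2, a * y1 + b * y2)
            for (x1, y1), (x2, y2) in zip(curve1, curve2)]


def build_plans(element_pairs, ratios):
    """One plan per ratio; every element curve in a plan uses that same ratio."""
    plans = []
    for w1, w2 in ratios:
        plans.append([blend(base, ref, w1, w2) for base, ref in element_pairs])
    return plans


# Toy data: two element-curve pairs (basic curve, reference curve).
element_pairs = [
    ([(0.0, 0.0), (2.0, 0.0)], [(4.0, 0.0), (6.0, 4.0)]),
    ([(0.0, 2.0)], [(4.0, 6.0)]),
]
plans = build_plans(element_pairs, ratios=[(3, 1), (1, 1), (1, 3)])
```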

Visual output and verification stage

Step 9: Use the image generation module of Stable Diffusion together with ControlNet to convert the morphological curves of the new plans into three-dimensional renderings.

Step 10: Evaluate the new plans through expert discussions based on FAHP and consumer perception questionnaires. The purpose of the evaluation is to verify whether the new plans are consistent with the image of the reference form (i.e., the target image) and to determine which new plan best matches the target image. Additionally, it is essential to verify the consistency and statistical significance of the two evaluation results.

Methodological framework

Based on the design method and implementation procedures outlined in Section “Implementation process of the design method based on KE and AIGC technology”, the research framework is shown in Figure 2. It describes the implementation process of the generative artificial intelligence-based method for the morphology optimization design of HDIRs.

Fig. 2 Methodological framework.
