Visit for more related articles at Journal of Internet Banking and Commerce
E-Commerce is reaching a new stage of conspicuous and widely disseminated development also due to the recent challenges of COVID 19. Thus, the process of shopping includes a first decision problem quite critical to the success of any business: which site of e-commerce should be selected by the consumer?
This means that the selection of such a site should be studied in terms of the consumer's satisfaction when using it for shopping, compared with other competing sites, rather than in terms of its descriptive features. The list of studies comparing and evaluating e-commerce sites is quite long, but almost no attempts have been made to model the satisfaction function of the consumer, so they are of limited relevance for studying the consumer's competitive choice of sites.
In this paper a new model of consumer satisfaction is proposed using an approach, TRIDENT, based on Multi-Attribute Utility Theory (MAUT). The critical stage of estimating the weights of multiple criteria is solved using an original method, OptionCards, which avoids the shortcomings of more traditional surveys.
The utility function describing the satisfaction function concerning websites of E-Commerce is estimated using the OptionCards method for a group of young professionals and university students confirming a similar importance assigned to the three major criteria.
Such utility function was used to estimate the rating of the 14 major Portuguese websites of E-Commerce using the answers of a group of young professionals using E-Commerce and the overall score confirms their relative level of popularity.
E-commerce; Customer satisfaction; Multicriteria model; OptionCards method.
Why a Multicriteria Model to Evaluate Websites for E-commerce?
E-commerce is becoming a prevailing type of commerce, based on the ubiquitous development of the digital society and on internet penetration: there are more than 2 billion digital buyers and more than 250 million internet domains, including about 125 million domains belonging to the “dot com” class. Several trends are quite promising, and recent challenges such as the pandemic spread of COVID-19 are accelerating the adoption of e-commerce even by older consumers, replacing traditional retail shopping by online business. The latest estimates of e-commerce therefore underestimate its growth, as they refer to 2019. Even so, the average penetration (percentage of the population aged 16 to 75 using e-commerce) for the EU is about 71%, and more than 80% has already been reached by States such as the UK, Germany, the Netherlands and the Nordic countries.
Therefore, the evaluation of websites promoting e-commerce has been studied by many authors, as is clear from the reviews published by Chiou et al., but unfortunately their main output is the presentation of long lists of features which are supposed to be relevant for such evaluation, without any systemic or taxonomic structure, as is clear from the summary table [6-8].
Furthermore, some of these features are overlapping such as “usability” and “quality” and doubts remain about the procedures to evaluate each feature. An evaluation model of a website is not a neutral instrument because it may be constructed to assess the achieved level of Information and Communications Technology (ICT) professional skills as it seems to be the case of world best enterprises, Sorum et al. emphasizing functionality, design, originality, etc. or it may be considered as a key instrument to facilitate the use of e-commerce by consumers as it should be the case for the objectives of this paper. Therefore, in this case, the evaluation should reflect the customer satisfaction for the purpose of e-commerce .
The technical quality of the site may not even be highly correlated with the customer satisfaction as it was shown by Mangiaracina et al. studying the customer satisfaction for awarded websites in terms of their quality and concluding that there is “lack of correlation between website quality and user satisfaction”.
The evaluation instrument should be independent of the commercial contents because the purpose is to assess the ability to implement e-commerce rather than the commercial strategy of the vendor. This distinction was not clear for some authors, who suggest including in the website evaluation process features such as price and promotion, which belong to the evaluation of the commercial strategy rather than of the website.
The evaluation model of a website for e-commerce should represent the evaluation by the customer and so no valid model can be proposed without aiming to estimate the customer satisfaction in terms of the relevant attributes, under the perspective of the consumer and without assessing the relative importance assigned by the consumer to such attributes .
Previously published research has not formulated the evaluation of websites for e-commerce through a utility function modelling customer satisfaction, although the need to consider the level of satisfaction was pointed out by several authors. Actually, Mangiaracina et al. and Barnes et al. emphasized the need to study the customer journey experience, but such study is just based on unweighted sums of scores. Between 150 and 170 parameters are required (!), making their estimation rather unfeasible. Recently, only the perspectives of information display, page organization and graphical design were considered [12-16], and two other authors, Rouyendegh et al. and Kaya et al., presented complex and fuzzy models using the AHP approach to compare alternative sites, but only based on pairwise comparisons, and so the number of experts and alternatives was very small and the results were subject to rank reversal [17-19].
Other authors surveyed the level of satisfaction for information websites, concluding that “navigation and content” are important, but no evaluation metric was proposed; they suggested that the customers' weights should be studied, but no procedure was developed and no further results were published along this line of research.
Thus, the final conclusions of “This absence of coordinated theory development causes the research in Internet marketing to appear haphazard and unfocused” and of “This study found that the existing literatures do to have any commonly agreed-upon standards or techniques for website evaluation” are well justified.
The Research Question and Methodologic Options
The main research question addressed by this paper is: how can the satisfaction of the customer be modeled for the purpose of evaluating e-commerce websites? The adopted methodology is Multi-Attribute Utility Theory (MAUT), and the structure of the model is based on an additive linear function of multiple criteria and sub-criteria, which have to be defined and constructed so that all the relevant perspectives contributing to the level of satisfaction of a customer using an e-commerce website are captured and conveniently described and represented.
The proposed model will be constructed in terms of a system of attributes following a tree structure, so that at each node the number of attributes will be reasonably small (e.g., n=3). This facilitates the estimation of weights by the customer, as many authors of decision theory have shown that such estimation is much harder and less rigorous if n is higher. The evaluation model should adopt a scoring function so that the consumer can assess each attribute, and the OptionCards method is used to elicit the weights assigned by the customer to each attribute.
Summing up, the objective of this paper is to develop a multicriteria model representing the satisfaction of the customer due to using a specific website for E–Commerce in order that the estimated evaluation will represent such satisfaction.
The application of this model implies the estimation of the appropriate value function representing the satisfaction of the population of customers which are supposed to use the e-commerce website and such estimation depends on the type of population to which the website is dedicated .
The Proposed Model
The proposed model aims to describe the level of satisfaction of the customer for the target group using a specific website for e-commerce in terms of a tree-structure of most relevant attributes and so the following components of the model have to be proposed:
a) A tree-structure of criteria so that all relevant perspectives for the consumer will be considered.
b) Scoring functions for each attribute.
c) A satisfaction model using a linear additive function MAUT.
d) A procedure recommended to elicit attributes weights.
The proposed model follows the lines of the TRIDENT model meaning that at each node the number of attributes will not exceed 3 and that the weights functions will be studied in terms of the 2-dimensional representation of indifference lines between pairs of evaluated alternatives.
The scoring function for each criterion or sub-criterion adopts a Likert scale ranging from 1 (“very bad”) to 5 (“excellent”) and the aggregation of scores at each level is obtained by a linear additive function in terms of weights which have to be elicited from the consumers.
These four components of the model are presented in the following sections.
1. The Major Attributes And Scoring Systems To Assess Consumer Satisfaction
1.1 Major attributes and scoring functions: The proposed three major attributes to describe the customer satisfaction are:
a) The quality of the website navigation experience;
b) The level of trust of the consumer to the whole commerce transaction;
c) The quality of information concerning the delivery logistic system.
These attributes, as well as the sub-criteria within each one, have been chosen after a review of the existing literature (European Union), which has identified these dimensions as the most critical ones for a successful experience of e-business.
1.2 On the navigation experience: Online commerce is based on the interaction of the potential consumer and a website which means that the traditional strolling along the main street and the consequent window shopping is replaced by the quite different experience of site opening, browsing and searching for the most appropriate product or services. This experience is particularly critical for the success of any online purchase because an unpleasant, tiring or cumbersome experience is responsible for the loss of the potential consumer and eventual “immunity” against the website in question.
In this domain the disparity of results is quite stunning, ranging from very high attractive levels to very unsuccessful results .
An analytical decomposition of this attribute can be suggested, considering three sub-attributes:
Registration easiness: A best practice  can be identified as requiring just the email and password leaving for a  further stage the collection of additional data, such as address, Tax Identification Number, Payment options, etc.
Therefore, the proposed descriptor, I1, describing the registration easiness is defined by number of items, N, and so the following scoring is proposed (Table. 1).
Table 1 Descriptor I1 (Registration easiness)
Search engine and advisory support: All websites include a search option, but the level of the semantic inference engine and its learning ability can range from very low intelligent algorithms to high quality expert systems.
Alternative indicators to evaluate the capabilities of each expert system have been proposed, but in this case a descriptor of their quality can be based on the ability to find the keyword representing the desired object using incorrect variations of that keyword, measuring the distance between the two words (original word and incorrect variation) in terms of the number of incorrect letters. For example, assuming the consumer looks for a smartphone (keyword, x) and introduces the word “smarkfone” (search word, y), does the engine still provide the correct answer? The semantic distance between two words can be formulated in alternative ways, but in this model the following definitions are adopted:
D(x,y) is the absolute semantic distance between the word x and the word y
A(y→x) is the number of letters to be added to y so that x will be obtained
S(y→x) is the number of letters to be subtracted from y so that x will be obtained
Therefore, the relative semantic distance d(x,y) can be determined by:
$$d(x,y) = \frac{D(x,y)}{L} = \frac{A(y \to x) + S(y \to x)}{L}$$
where L is the average dimension of the two words given by:
$$L = \frac{I(x) + I(y)}{2}$$
where I(i) is the number of letters of word i.
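As an illustration of these definitions, the following minimal Python sketch computes A, S, D and d for the “smartphone” / “smarkfone” example. It assumes that D is the sum of the added and subtracted letters, that d is D divided by the mean word length, and that the letters to keep are those of the longest common subsequence of the two words; the function names are hypothetical and this alignment rule is one possible reading of the definitions, not necessarily the one used by the authors.

```python
def lcs_length(x: str, y: str) -> int:
    """Length of the longest common subsequence of x and y (dynamic programming)."""
    prev = [0] * (len(y) + 1)
    for cx in x:
        curr = [0]
        for j, cy in enumerate(y, start=1):
            if cx == cy:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def relative_semantic_distance(x: str, y: str):
    """Return (A, S, D, d) for keyword x and search word y.

    Assumption: letters shared in order (the LCS) are kept, the rest of x
    must be added to y (A) and the rest of y must be subtracted (S).
    """
    common = lcs_length(x, y)
    added = len(x) - common        # A(y -> x)
    subtracted = len(y) - common   # S(y -> x)
    absolute = added + subtracted  # D(x, y), assumed to be A + S
    mean_length = (len(x) + len(y)) / 2  # L
    relative = absolute / mean_length if mean_length else 0.0
    return added, subtracted, absolute, relative


A, S, D, d = relative_semantic_distance("smartphone", "smarkfone")
print(A, S, D, round(d, 3))
```

For this pair the letters “t”, “p” and “h” have to be added and “k” and “f” subtracted, so D = 5 over a mean length of 9.5 letters.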
Consequently, the quality of the search engine will be studied in terms of the probability P of finding the correct keyword in terms of d(x,y) and usual statistical parameters can be used to describe such relationship such as its average, its variance, or its quantile for any statistical level.
This means that the descriptor of the quality of the search engine (I2(s)) will be expressed by a statistical measure of P for the specified level of d(x,y) measured by the following scale (Table. 2).
|P ≥ 80%||5|
|80%>P ≥ 60%||4|
|60%>P ≥ 40%||3|
|40%>P ≥ 20%||2|
|20%>P ≥ 0%||1|
Table 2 Descriptor I2(s) (Quality of the Search Engine)
The application of this measuring scale implies testing the website using random keywords which describe the search relevancy for the e-consumer including corrupted random variations of two letters in each of these keywords.
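The testing procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: `corrupt` replaces two randomly chosen letters by different lowercase letters (the paper does not specify how the variations are generated), `estimate_p` takes a list of pass/fail outcomes that would come from manually or automatically querying the website, and `search_engine_score` applies the scale of Table 2. All three names are hypothetical.

```python
import random
import string


def corrupt(word: str, n_letters: int = 2, rng=None) -> str:
    """Return a copy of word with n_letters positions replaced by other letters."""
    rng = rng or random.Random()
    letters = list(word)
    for pos in rng.sample(range(len(word)), n_letters):
        choices = [c for c in string.ascii_lowercase if c != letters[pos]]
        letters[pos] = rng.choice(choices)
    return "".join(letters)


def estimate_p(outcomes) -> float:
    """Share of corrupted queries for which the correct item was still found."""
    outcomes = list(outcomes)
    return sum(bool(o) for o in outcomes) / len(outcomes)


def search_engine_score(p: float) -> int:
    """Map the probability P (0..1) to the 1-5 score of Table 2."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p >= 0.8:
        return 5
    if p >= 0.6:
        return 4
    if p >= 0.4:
        return 3
    if p >= 0.2:
        return 2
    return 1


# Hypothetical outcomes of four corrupted queries (True = correct item found).
p = estimate_p([True, True, False, True])
print(corrupt("smartphone", rng=random.Random(0)), p, search_engine_score(p))
```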
Another important feature concerns the ability to receive advisory support helping the consumer to make the best choice. This support can be given by different types and levels of instruments concerning the specific good or service being considered by the customer (Table. 3).
|A direct video call line, phone call line and an email contact allowing direct interaction between the customer and the staff+a menu of frequently asked questions (FAQ)||5|
|A direct phone call line and an email contact allowing direct interaction between the customer and the staff+FAQ||4|
|An email contact allowing interaction between the customer and the staff+FAQ||3|
|No advisory support||1|
Table 3 Descriptor I2(a) (Quality of the advisory support)
I2 can be given by:
Full specification of the product or service: The online customer experience is very sensitive to this attribute because direct interaction between customer and product or service no longer exists, and therefore the website should replace the three-dimensional process.
The following descriptor and measuring scale are proposed (Table. 4).
|Full description of the product with at least 3 HD pictures and/or video||5|
|Full description of the product with fewer than 3 HD pictures and no video||4|
|Incomplete description and one or more pictures and no video||3|
|Incomplete description without pictures or video||2|
|No description and no picture or videos||1|
Table 4 Descriptor I3 (Full specification of the product or service)
1.3 On the level of trust: Any commercial transaction implies an underlying contract which will not be accepted if the consumer does not have sufficient trust in the merchant and in that specific transaction. Actually, the issue of trust is quite critical to the development of online commerce as there is no real physical experience of direct contact between buyer and seller.
Therefore, this major concept of trust should be considered under a threefold perspective:
1) Level of security of the website implying that the whole site is secure and not just the payment system avoiding leaks of information about the object or service to be purchased, passwords and registration data, etc.
2) Level of confidence on the transaction meaning that if the consumer is not satisfied, the transaction can be reversed easily and without additional costs for the consumer. This explains why the European Union adopted a Directive on online commerce imposing a minimal period of 14 days to give the consumer the possibility of reversing the transaction, returning the product, and being fully reimbursed. It should be noted that some marketplaces like eBay or Amazon have adopted even more favorable rules managed by the so called “dispute centers”. However, other marketplaces based on RPC adopt different rules, namely on the return cost of the product.
3) Level of security of the payment system assuring that no frauds or leaks of data will occur.
Thus, the proposed descriptors and measuring scales for these three perspectives are presented in (Tables. 5-7).
|If the whole website is a secured one (encrypted using Transport Layer Security (TLS))||5|
|If just some pages of the web site are secured||3|
|If no pages are secured||1|
Table 5 Descriptor I4 (Trust in the website (excluding the payment system))
|If reversing the transaction can be easily requested by the consumer through the site, is approved in less than 2 days, and the consumer is reimbursed also in less than 2 days without having to pay any return costs.||5|
|If reversing the transaction can be easily requested by the consumer through the site, is approved in more than 2 days and less than 14 days, and the consumer is reimbursed in the same period without having to pay any return costs.||3|
|If reversing the transaction cannot be easily requested through the website and if the customer has to pay the return costs||1|
Table 6 Descriptor I5 (Trust in the transaction)
|If the payment system is fully secure and multiple options are offered including systems like ATM payment or PayPal avoiding having to introduce bank card data||5|
|If the payment system is fully secure but no options avoiding the introduction of bank card data are included||3|
|If the payment system is not fully secure||1|
Table 7 Descriptor I6 (Trust in the payment system)
1.4 On the website information concerning logistics: This dimension is also quite critical because the purpose of the transaction is receiving the acquired product or service as soon as possible and without additional uncertainty. However, this model is devoted to the evaluation of the website rather than of the commerce process and so the object of this evaluation should concern the display of information concerning this dimension.
Therefore, this perspective can be evaluated according to the following three descriptors (Tables. 8-10):
|Full information about the time and transportation mode is presented before completing the purchase.||5|
|Partial information about the time and transportation mode is presented before completing the purchase.||4|
|Full information about the time and transportation mode is just presented when the purchase is completed.||3|
|Partial information about the time and transportation mode is just presented when the purchase is completed.||2|
|No information is presented||1|
Table 8 Descriptor I7 (Delivery time)
|If online real time tracking is available and notices are sent before the delivery||5|
|If tracking is available just near the delivery time and notices are sent before the delivery||4|
|If no tracking is available but notices are sent before the delivery||3|
|If no tracking is available and no notices are sent||1|
Table 9 Descriptor I8 (Tracking information)
|Full information about flexibility of delivery time is presented before completing the purchase.||5|
|Partial information about flexibility of delivery time is presented before completing the purchase.||4|
|Full information about flexibility of delivery time is just presented when the purchase is completed.||3|
|Partial information about flexibility of delivery time is just presented when the purchase is completed.||2|
|No information is presented||1|
Table 10 Descriptor I9 (Flexibility of Delivery Time)
2. Multicriteria Evaluation
2.1 The Trident Model
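This model is based on an application of Multi-Attribute Utility Theory (MAUT) to a multicriteria tree with three branches stemming from each node, subdividing the weight space into sub-areas with the same ranking of alternatives. In this case the three main criteria are the navigation experience, the level of trust and the website information concerning logistics, and each of these nodes is subdivided according to the presented sub-criteria. The evaluation tree is presented in Figure 1.
Therefore, for each node associated with three criteria, the MAUT evaluation function can be defined by:
$$U(i) = \lambda_1 U_{i1} + \lambda_2 U_{i2} + \lambda_3 U_{i3}, \qquad \lambda_1 + \lambda_2 + \lambda_3 = 1, \quad \lambda_j \ge 0$$
where U_ij is the score of alternative i according to criterion j and λ_j is the weight assigned to criterion j.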
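A minimal Python sketch of this additive evaluation is given below, assuming that the utility at each node is the weighted sum of the scores of its three children with weights adding up to 1. The function name `node_utility`, the sub-criteria scores and the equal sub-criteria weights are hypothetical illustrations; only the top-level weights (0.33, 0.34, 0.33) are the averages reported later in the paper.

```python
def node_utility(scores, weights, tol=1e-9):
    """Weighted additive utility of one tree node (weights must sum to 1)."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * s for w, s in zip(weights, scores))


# Hypothetical Likert scores (1-5) for the nine sub-criteria I1..I9 of one website.
equal = (1 / 3, 1 / 3, 1 / 3)                  # assumed sub-criteria weights
navigation = node_utility((5, 4, 3), equal)    # I1, I2, I3
trust = node_utility((5, 3, 5), equal)         # I4, I5, I6
delivery = node_utility((3, 5, 2), equal)      # I7, I8, I9

# Top-level aggregation with the average weights estimated in the paper.
overall = node_utility((navigation, trust, delivery), (0.33, 0.34, 0.33))
print(round(overall, 2))
```

Because every node uses the same function, the tree can be evaluated bottom-up with one call per node.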
Now, the application of this MAUT model at each node of the presented tree implies the estimation of the weights representing the values of each decision maker, which is always a difficult problem. The most often adopted approach is based on inquiries of the type “Do you consider criterion j equally important, more or less important than j’?” so that the estimated weights represent such answers, but the main criticism is that decision makers do not know exactly what it means to be equally, more or less important. The OptionCards method adopts a completely different approach based on successive binary questions which are unequivocally understood by the decision maker, avoiding the shortcomings of the traditional approach, and thus it is presented in the next section.
2.2 The OptionCards Method
The weights selected by any decision maker can be represented by a point in a two-dimensional triangular space, S_1,2, bounded by the axes corresponding to λ_1 and λ_2, since λ_3 = 1 − λ_1 − λ_2.
The subset of points for which there is indifference I_ik between two alternatives i and k is a straight line defined by:
$$\lambda_1 (U_{i1} - U_{k1}) + \lambda_2 (U_{i2} - U_{k2}) + (1 - \lambda_1 - \lambda_2)(U_{i3} - U_{k3}) = 0$$
and this straight line (Figure 2) subdivides the space into two subspaces: Dik, where U(i)>U(k), meaning that i is preferred to k, and Dki, where U(k)>U(i), meaning that k is preferred to i.
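The indifference line and the two subspaces can be illustrated with a short Python sketch. It assumes the additive utility with λ_3 = 1 − λ_1 − λ_2, so that U(a) − U(b) is linear in (λ_1, λ_2); the function names are hypothetical, and the alternatives are those the text arbitrates for the first Option Card.

```python
def indifference_line(a, b):
    """Coefficients (c1, c2, c0) of c1*l1 + c2*l2 + c0 = 0, i.e. U(a) = U(b)."""
    d1, d2, d3 = (ai - bi for ai, bi in zip(a, b))
    # U(a) - U(b) = l1*d1 + l2*d2 + (1 - l1 - l2)*d3
    return d1 - d3, d2 - d3, d3


def preferred(l1, l2, a, b, tol=1e-9):
    """Return 'a>b', 'a<b' or 'a=b' for a decision maker with weights (l1, l2, 1-l1-l2)."""
    c1, c2, c0 = indifference_line(a, b)
    value = c1 * l1 + c2 * l2 + c0
    if value > tol:
        return "a>b"
    if value < -tol:
        return "a<b"
    return "a=b"


a = (4, 1, 2)  # scores of alternative a on the three criteria
b = (1, 4, 2)  # scores of alternative b
print(indifference_line(a, b))    # line 3*l1 - 3*l2 = 0, i.e. l1 = l2
print(preferred(0.5, 0.2, a, b))  # weights on the side where a is preferred
```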
Each value of any arbitrated alternatives follows the adopted Likert scale.
Thus, if the decision maker selects Iik,Dik or Dki then information can be elicited about the subdomain containing his preference and so the OptionCards method is based on a sequence of questions comparing arbitrated alternatives so that the sub-domain containing the point belonging to S1,2 and describing the weights selected by the decision maker will be progressively narrowed.
Each binary question is presented to the decision maker by an Option Card, OC(m), including two alternatives (a_m, b_m); the decision maker can select one of the options (a>b), (a=b) or (a<b), and the next Option Card, OC(m+1), will be presented in terms of such answer.
The questions to be asked should concern the comparisons related to the subspaces indicated in Figure 3 where the indifference lines (R1,….,R6) and the subspaces ( D1,…,D8) are represented.
Thus, the sequence of questions to be asked is as follows:
1) OC(1): the indifference line to be considered, Iab corresponds to R1 and so Dab should include (D1∪D2∪D3∪D4) and Dba should include (D5∪D6∪D7∪D8). Furthermore, if the answer corresponds to Iab then the weights of the decision maker should satisfy the equation of R1.
The alternatives a,b can be freely arbitrated providing that they should respect the equation of Iab.
The adopted alternatives are:
a: U_a1=4; U_a2=1; U_a3=2
b: U_b1=1; U_b2=4; U_b3=2
2.1) If Dab from OC (1), then OC(2.1) should be based on the indifference line Iab represented by R2 and so Dab corresponds to (D1∪D2) and Dba corresponds to (D3∪D4). If the answer corresponds to the indifference, then the equation corresponding to R2 should be respected.
The arbitrated alternatives are:
a: U_a1=4; U_a2=1; U_a3=4
b: U_b1=1; U_b2=4; U_b3=1
2.2) If Dba from OC(1), then OC(2.2) should be based on the indifference line Iab represented by R3 and so Dab corresponds to (D5∪D7) or if Dba then one has (D6∪D8). If the answer corresponds to the indifference then the equation of R3 should be respected.
The arbitrated alternatives are:
a: U_a1=4; U_a2=1; U_a3=1
b: U_b1=1; U_b2=4; U_b3=4
3.1.1) If Dab from OC (2.1), then OC(3.1.1) should be based on the indifference line Iab represented by R4 and so Dab corresponds to D1 and if Dba then one has D2. If the decision maker chooses the indifference, then R4 should be respected.
The arbitrated alternatives are:
a: U_a1=3; U_a2=1; U_a3=3
b: U_b1=1; U_b2=3; U_b3=1
3.1.2) If Dba from OC (2.1), then OC(3.1.2) should be based on the indifference line Iab represented by R5 and so Dab corresponds to (D3) and if Dba then one has (D4). The equation related to R5 should be respected if the answer corresponds to indifference.
The arbitrated alternatives are:
a: U_a1=4; U_a2=4; U_a3=1
b: U_b1=1; U_b2=1; U_b3=4
3.2.1) If Dab from OC (2.2), then OC(3.2.1) should correspond to the indifference line Iab represented by R5 and so Dab corresponds to D5 and Dba to D7. The equation related to R5 should be respected if the answer corresponds to indifference.
The alternatives corresponding to R5 were already arbitrated.
3.2.2) If Dba from OC (2.2), then OC(3.2.2) corresponds to the indifference line Iab represented by R6 and so if Dab then one has D6 and if Dba then one has D8. The equation related to R6 should be respected if the answer corresponds to indifference.
The arbitrated alternatives are:
a: U_a1=3; U_a2=1; U_a3=2
b: U_b1=2; U_b2=4; U_b3=3
Obviously, further subdivisions of the space S_1,2 can be progressively done using further Option cards.
The application of this method is quite easy as it just implies presenting the option cards and receiving the answers, following the inquiry tree presented in Figure 5.
Finally, the estimate of the weights of the decision maker is obtained by averaging the coordinates of the extreme points of the identified sub-domain, and such coordinates are presented in Table 11.
Table 11 Average coordinates of each subdomain
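The progressive narrowing of the weight space can be sketched numerically. The Python example below is a simplified illustration, not the authors' procedure: it asks a flat sequence of three cards (those the text associates with R1, R3 and R5) rather than following the branching tree, it simulates the decision maker from hypothetical "true" weights, and it estimates the weights as the centroid of the grid points that remain feasible instead of averaging the extreme points of the sub-domain. All function names are hypothetical.

```python
def preference(weights, a, b, tol=1e-9):
    """Answer '>', '<' or '=' when comparing alternatives a and b under weights."""
    diff = sum(w * (ai - bi) for w, ai, bi in zip(weights, a, b))
    if diff > tol:
        return ">"
    if diff < -tol:
        return "<"
    return "="


def simplex_grid(step=20):
    """Regular grid of weight vectors (l1, l2, l3) with l1 + l2 + l3 = 1."""
    return [(i / step, j / step, (step - i - j) / step)
            for i in range(step + 1) for j in range(step + 1 - i)]


def elicit(answer, cards, step=20):
    """Keep the grid points consistent with every answer and return their centroid."""
    candidates = simplex_grid(step)
    for a, b in cards:
        reply = answer(a, b)
        candidates = [w for w in candidates if preference(w, a, b) == reply]
    if not candidates:
        raise ValueError("answers are inconsistent on this grid")
    n = len(candidates)
    return tuple(sum(w[k] for w in candidates) / n for k in range(3))


cards = [((4, 1, 2), (1, 4, 2)),   # indifference line l1 = l2
         ((4, 1, 1), (1, 4, 4)),   # indifference line l1 = 0.5
         ((4, 4, 1), (1, 1, 4))]   # indifference line l3 = 0.5
true_weights = (0.6, 0.1, 0.3)     # hypothetical decision maker
estimate = elicit(lambda a, b: preference(true_weights, a, b), cards)
print(tuple(round(w, 2) for w in estimate))
```

Each answer removes the part of the grid lying on the other side of the corresponding indifference line, so the estimate always satisfies the stated preferences.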
The proposed model is applied to the evaluation of the 14 most visited e-commerce websites of the Portuguese market, excluding booking.com because traveling was severely affected by COVID-19. A focus group of 43 young and qualified respondents was used to estimate the evaluation of each website according to the presented 9 sub-criteria and the estimated average scores obtained in 2020 are presented in Table 12.
Table 12 Website Evaluation sub-criteria
The average score of each sub-criterion is also presented in the last row, showing that the full specification of products, the level of security of the payments and the tracking information are highly rated by the respondents, while the level of confidence in the transaction, the delivery time and its flexibility are rated below the other attributes (Figure 4).
The coefficient of variation, CV, of the scores assigned by the respondents to each site according to each criterion is low and under 10%, meaning that the limits, L, of the confidence interval (95% level) for the estimated mean, M, of each site and sub-criterion are given by:
$$L = M \left(1 \pm 1.96\,\frac{CV}{\sqrt{n}}\right)$$
where n is the number of respondents. With CV below 10% and n = 43, the limits stay within about 3% of the mean, confirming that a larger sample is not required.
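The size of these confidence limits can be checked with a short Python sketch. It assumes the usual normal approximation, in which the relative half-width of the 95% interval for the mean is 1.96·CV/√n; the function names are hypothetical. With CV = 10% this assumption reproduces the factor 0.017 that the paper reports for the sample of 131 respondents.

```python
import math


def relative_half_width(cv: float, n: int, z: float = 1.96) -> float:
    """Half-width of the confidence interval for the mean, as a share of the mean."""
    return z * cv / math.sqrt(n)


def confidence_limits(mean: float, cv: float, n: int, z: float = 1.96):
    """Lower and upper limits L = M * (1 -/+ z * CV / sqrt(n))."""
    h = relative_half_width(cv, n, z)
    return mean * (1 - h), mean * (1 + h)


for n in (43, 131):  # sample sizes used for the scores and for the weights
    print(n, round(relative_half_width(0.10, n), 3))
```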
A broader focus group of 131 respondents, including not just young professionals but also university students (bachelor or master level), was used to estimate the criteria weights, because estimating their preferences does not require them to be frequent consumers, and so the student group was considered relevant and included in this sample. This study was carried out using the OptionCards method implemented through interactive software.
The following results were obtained:
a) The average weights for navigation, trust and delivery are 0.33, 0.34 and 0.33, respectively;
b) The estimated coefficients of variation are under 10% and so the confidence limits, L (95% confidence level), for the estimated means, M, are quite close to each other: L = M·(1 ± 0.017).
c) The variation of weights in terms of gender ranges from 0.31 to 0.37 (Table 13).
Table 13 Criteria weights according to gender
d) The variation of weights in terms of scientific areas ranges from 0.28 to 0.38 (Table. 14).
|Major Scientific Domain||λ1||λ2||λ3|
|Management and Economics (B)||0.28||0.36||0.36|
|Social Sciences and Law (C)||0.3||0.32||0.38|
Table 14 Criteria weights according to scientific area
Meaning that the weights variations are not statistically significant. The overall evaluation of each website according to the preferences of each evaluator can be now estimated using the estimated averages of λ_1=0.33, λ_2=0.34 and λ_3=0.33, with the results presented in Table 15:
Table 15 Overall website evaluation
Finally, the relationship between such overall evaluation and the ranking of the number of visits of each site is presented in Figure 5, showing that the most visited websites receive quite positive evaluations according to our model, ranging from 3.34 to 4.44, and that there is a positive slope of evaluation in terms of the ranking position.
a) The evaluation of websites of e-commerce has been extensively studied but not giving the required attention to the consumer satisfaction function because the relative preferences of the consumers have not been conveniently studied.
b) The proposed model, Trident, is based on MAUT and the proposed criteria cover the key dimensions of navigation, trust and delivery.
c) This model includes an original approach, the OptionCards method, to facilitate the estimation of the weights assigned by the consumers to the proposed criteria and therefore the evaluation results of the websites are not independent of the studied consumers.
d) The proposed model is successfully applied to the evaluation of the 14 most popular websites of e-commerce in Portugal using interactive software to implement the OptionCards method and the results are also presented herein.
e) The application of the OptionCard Method was very well accepted by the 131 respondents because it takes no more than 6 minutes as it just requires the successive comparisons concerning the three levels of OptionCards.
f) The obtained results confirm that the most popular websites have quite positive global evaluations and that there is a general positive trend between such evaluations and the ranking position of each website.
g) Summing up, the proposed instrument is an appropriate metric tool to evaluate the websites for e-commerce in terms of the specific target group of consumers to be considered.
This research has been developed by COMEGI: Research Centre on Organizations, Markets and Industrial Management of the Universidades Lusíadas (Foundation Minerva) (http://comegi.ulusiada.pt/), CiTUA: Center for Innovation in Territory, Urbanism, and Architecture of IST-University of Lisbon (http://citua.tecnico.ulisboa.pt/), OPET: Portuguese Observatory of Technology Foresight (https://www.opet.pt/), APMEP: Portuguese Association of Public Markets (http://www.apmep.pt/) and VORTAL, Connecting Business https://www.vortal.biz/.
This project received continuous support of a Steering Group including the following colleagues: Ruben Assis (Vortal Portugal), José Luís Arístegui (Vortal Spain), Pedro Coimbra (Fund for Municipalities Development), Marco Coelho (Municipality of Sintra), Ana Sá (IST, University of Lisbon), Gonçalo Mendes (OPET and APMEP), Vasco Moreira (OPET and APMEP) and João Branco (Electricity of Portugal, EDPC) plus other invited colleagues from other corporations and municipalities. This Steering Group has been chaired by Luís Valadares Tavares and José Antunes Ferreira.